heh, poor guy. (dunno why he couldn't just send you a drive, tho)
glad to see the stuff showing up in Google; wondering how
machine-parseable the OCRed PDFs will be for the 501c3s. It looks like
Guidestar has people do data-entry by hand.
I'm currently trying to get BK to let me scan the personal financial
disclosures for members of Congress as part of the Kahle-Omidyar
govdocs project; if we get those the plan is to mturk them.
On Mon, Oct 6, 2008 at 6:36 PM, Carl Malamud <xxxxxxx@media.org> wrote:
irs says they sent me my 6 tbytes of data on dvds today. took them 3 months
to fill the order ... evidently there is some dude in utah who does all of
these by himself and I screwed up his whole summer.

guess I better figure out how to do a dvd jukebox or this is going to get
old really quickly ... 1500 dvds to copy and run through ocr. but, this is
starting to work. My 527 stuff is starting to show up in google:
http://www.google.com/search?q=%22political+organization%22++site%3Abulk.resource.org
Carl