Message 00043: Re: FYI: catalog.loc.gov

> > Have you seen this?
> > http://bulk.resource.org/gpo.gov/Harvest_Report.html
> Yeah. What I couldn't quite figure out is where the data came from
> originally, which I'm really curious about. It looks cool, though.

It is a full drain of gpo.gov ... they maintain a bunch of, believe
it or not, WAIS databases.  They make those visible via the GPO
Access interface, which presents a WAISgate interface to the world.
We went through that interface about 10 million times, getting as
many PDF docs as we could out of it.  The cool thing is we told GPO
we were doing it and they didn't object.  This stuff is in alpha
still and we'll make it more by early next year.