Message 00610: Re: Fellow Pacer Downloader

oof ... you could have uploaded those for her to one of your servers.

i'm not sure what to do with those. was really hoping people would take some of the existing district court tarballs and put them into real services, but so far, nobody has done anything.

have you looked at the data? the big issue we have right now is that people usually don't have any metadata like your crawler gave us ... instead, you get a bunch of PDFs that have no metadata in the headers.

I'd love to figure out how to accept these contributions, but I'm not sure we can do anything useful.

That said, I will write her back and see if I can grab her data. But, this is definitely not a scalable operation the way it is now, so you'll definitely want to field the next one of these yourself. :)

On Feb 18, 2009, at 3:43 PM, Aaron Swartz wrote:

Hi, Adria. Let me introduce you to Carl Malamud, who runs resource.org

Carl, Adria has 7000 PACER documents she'd like to contribute.

On Wed, Feb 18, 2009 at 6:42 PM, Adria H <xxxxxxx@ymail.com> wrote:
FTP, since it's more than 5GB.

I just went to resource.org - under the pacer folder, there isn't a folder created for bankruptcy court. These are all bankruptcy court docs. Would you create a new folder for bankruptcy courts?

Not wanting to sound too cautious, but are there still investigations of resource.org ongoing?

Do you have plans to improve the search capabilities for resource.org? Seems like one can use Goog site search or write a script, but for true "public access" to legal docs, a user-friendly search platform is key (I'd like normal people besides Stanford geeks like me to use resource.org). Thoughts?


From: Aaron Swartz <me@aaronsw.com>
To: Adria H <xxxxxxx@ymail.com>
Sent: Wednesday, February 18, 2009 6:29:00 PM
Subject: Re: Fellow Pacer Downloader

I read the NY Times article about your efforts regarding Pacer downloads.
I have downloaded more than 7,000 documents from Pacer using the designated libraries (until they were suspended). It's nothing compared to your feat. Anyway, do you have a website or platform for putting these documents
is now cleared by Legal)?

That's great. Yes, the documents are being published on resource.org.
What's a convenient mechanism for you to send them?