Message 00214: Re: pacer program

I've got a quickie for Tim. How do you get the info for your case listings, and are they definitely exhaustive back to 2004? Do you have a special XML feed from the PACER folks or do you just somehow parse the site? If the answer is the former, is there a special agreement involved?

And a quick follow-up for Carl: I believe we already have the "magic pacer header" turned on. If it appears that this is not the case, please let us know.

On Sep 6, 2008, at 7:03 PM, Carl Malamud wrote:
Tim Stanley (who drains pacer at Justia on a proxy basis) and John Joergensen (librarian at Rutgers who is organizing the court reporters on the east coast), please meet two very talented new recruits, certified MIT rocket scientists. ;)

Aaron and Stephen have decided to adopt a local district court and then take advantage of the local pacer "public" trial to systematically grab all opinions for their jurisdiction and then put them on bulk.resource.org. I've given Aaron an account. Once we have an archive of their data, we'll scrub it for SSNs, then figure out how to inform the chief judge that we have his data available if he wants it.

Tim, can you review sample docs that they harvest? John, I wanted to make you aware we have a couple ringers helping kick this off. Both these guys are highly clueful. I've asked them to a) turn on the magic pacer header on the top of the pdf and b) embed the information we need for a unique id in the metadata for the pdf file.


On Sep 6, 2008, at 3:57 PM, Aaron Swartz wrote:

Can you introduce us to Tim Stanley? Schultze is trying to figure out
which cases to focus on with his Thumb Drive Corps members and is
looking at the list of case names in Justia and wondering how it was
generated. Perhaps Tim can also review some sample output and make
sure we're getting the right stuff.