Message 00306: pacer

To: Stephen Schultze <xxxxxxxxxx@cyber.law.harvard.edu>, Aaron Swartz <xx@aaronsw.com>
Subject: pacer
From: Carl Malamud <xxxx@media.org>
Date: Fri, 26 Sep 2008 15:54:41 -0700

So, I haven't heard from any of our metadata dudes, so let me suggest the following plan to move forward:

1. I am running an SSN scanner on the data ... that will give me a bunch of hits that I have to look at by hand, so it will take me a while to scrub it.

2. When the scan is done, I propose we announce the initial progress (25% (or more) of PACER created by the pacer p3wner posse). I will announce that as an "alpha" release.

3. You are free to keep accumulating data ... I want to be > 50% by the end of the year.

4. If/when we get metadata advice on how to do the XMP headers using a custom rdf schema, I'll stamp the existing archive, and call that beta.

5. Once we have a unique id stamp, we can start pulling in other pacer collections, such as my recycling stuff and Tim Stanley's proxy data (he uses client pacer id's, bills them for the work, then (with their permission) keeps a copy of the files).
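
A rough sketch of the kind of scan item 1 describes, not the actual scanner: it assumes the documents have already been converted to plain text, and the names (SSN_PATTERN, find_ssn_candidates) are made up for illustration. It flags anything shaped like an SSN, skips number ranges that are never issued, and returns some surrounding context so each hit can still be checked by hand.

```python
import re

# Hypothetical pattern: three groups of digits separated by hyphens or spaces,
# not embedded in a longer run of digits.
SSN_PATTERN = re.compile(r"(?<!\d)(\d{3})[- ](\d{2})[- ](\d{4})(?!\d)")


def find_ssn_candidates(text, context=30):
    """Return (offset, matched_text, surrounding_text) for each SSN-shaped hit.

    These are only candidates: docket numbers, phone fragments, and the like
    can match too, so every hit still needs a manual look.
    """
    hits = []
    for m in SSN_PATTERN.finditer(text):
        area, group, serial = m.groups()
        # Area numbers 000, 666, and 900-999 are never issued.
        if area in ("000", "666") or area.startswith("9"):
            continue
        # Group 00 and serial 0000 are never issued either.
        if group == "00" or serial == "0000":
            continue
        start = max(0, m.start() - context)
        end = min(len(text), m.end() + context)
        hits.append((m.start(), m.group(0), text[start:end]))
    return hits
```

Scanned PDFs with no text layer would need OCR first, which this sketch does not attempt.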

I don't want to merge collections unless a script can tell whether file a (e.g., document x of docket foo) is the same as another file (e.g., document y of somebody else's version of docket foo).
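
One simple way to get part of the way there, as a sketch only: group files by a hash of their contents. This assumes two copies of the same document are byte-for-byte identical, which may not hold if one collection has been stamped or re-saved; in that case the hash would differ and some normalized comparison would be needed instead. The function names and labels are hypothetical.

```python
import hashlib
from collections import defaultdict


def content_digest(data: bytes) -> str:
    """SHA-256 of the raw file bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def group_identical(documents):
    """Given (label, bytes) pairs, return lists of labels whose bytes match.

    Only groups with more than one member are returned, i.e. the files
    that appear in more than one place.
    """
    groups = defaultdict(list)
    for label, data in documents:
        groups[content_digest(data)].append(label)
    return [labels for labels in groups.values() if len(labels) > 1]


# Hypothetical usage with in-memory stand-ins for two collections:
docs = [
    ("mine/foo/x.pdf", b"same bytes"),
    ("theirs/foo/y.pdf", b"same bytes"),
    ("mine/foo/z.pdf", b"different bytes"),
]
duplicates = group_identical(docs)
```

A match here says the two files are the same; a mismatch does not prove they are different documents.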

Does this sound like a plan? Our biggest PR boost will be the percentage of PACER we've retrieved. So, if you want to create a pacer2 area and fill it, go for it.

One last item ... I plan on giving you guys full credit for your work. So, if you don't want that, better speak up now. :))


Carl

Follow-Ups:
- Re: pacer
  - From: "Aaron Swartz" <xx@aaronsw.com>
