Message 00286: Re: gpo archive

so ....

you want

thumper.public.resource.org:/pro/bulk.resource.org/htdocs/gpo.gov/.htaccess

to have the line

    AddType text/html txt

Is that right?


On Sep 19, 2008, at 7:38 PM, Aaron Swartz wrote:

I'm pretty sure their waisgate interface returns everything as html

On Fri, Sep 19, 2008 at 10:34 PM, Carl Malamud <xxxxxxx@media.org> wrote:
On Sep 19, 2008, at 7:23 PM, Aaron Swartz wrote:


Those documents are actually lame HTML, not text. (They're wrapped in
a <pre> tag and &, <, and > are all quoted.)

we just shelve what they ship.  :)

yeah, but you should serve them with content-type: text/html so they
render correctly.

er. I suppose I could do that if I knew exactly which directories should get that .htaccess mod to the normal handling for txt and which ones really are ascii txt.

in our next stage of evolution, I hope to have people spending more time making the data better, but right now the focus is much more on proving the point and homing in. seriously ... you don't want my mirror of the waisgate system, you want the raw data from gpo.

If someone wants to look at the txt files in gpo.gov/... and say which ones are really html, happy to throw the right mime types back. Otherwise, we assume it is a client issue.
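
For anyone taking up that offer, a rough sketch of how the check might be scripted. Everything here is an assumption for illustration: the function names, the 64 KB sample size, and the heuristic itself (a <pre> tag or a quoted &amp;, &lt;, or &gt; means the file is really html). It has not been run against the actual mirror.

```python
import re
from pathlib import Path

# Heuristic (an assumption, not anything GPO documents): a .txt file is
# treated as html if it contains a <pre> tag or one of the entity escapes
# mentioned in the thread (&amp;, &lt;, &gt;).
HTML_HINT = re.compile(rb"<pre\b|&(?:amp|lt|gt);", re.IGNORECASE)


def looks_like_html(data: bytes) -> bool:
    """Return True if the byte sample shows signs of html markup."""
    return HTML_HINT.search(data) is not None


def html_dirs(root, sample_bytes=65536):
    """Return the directories under root whose .txt files all look like html.

    Only the first sample_bytes of each file are read, which is enough to
    catch a leading <pre> wrapper without loading large documents whole.
    """
    verdicts = {}
    for path in Path(root).rglob("*.txt"):
        with open(path, "rb") as fh:
            is_html = looks_like_html(fh.read(sample_bytes))
        verdicts.setdefault(path.parent, []).append(is_html)
    # A directory qualifies only if every sampled file looked like html.
    return sorted(str(d) for d, flags in verdicts.items() if all(flags))
```

Calling html_dirs() on the local gpo.gov/ tree would list the directories that are candidates for their own .htaccess with the AddType line; directories with a mix of html and plain ascii files are left out and would need a look by hand.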