Message 00283: Re: gpo archive

Either that or (probably more sensibly) strip out the HTML and decode the & entities.
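
A rough sketch of the strip-and-decode option, in Python. It assumes the files contain nothing beyond simple wrapper tags (such as <pre>) around entity-quoted text, which I have not checked across the whole archive; the function name gpo_html_to_text is made up for illustration.

```python
import html
import re


def gpo_html_to_text(raw: str) -> str:
    """Turn an entity-quoted, tag-wrapped document back into plain text.

    Hypothetical helper: assumes the only markup is simple wrapper
    tags (e.g. <html>, <body>, <pre>) with no comments or scripts.
    """
    # Remove opening and closing tags first. Literal < and > in the
    # bill text are still quoted as &lt; and &gt; at this point, so
    # they cannot be mistaken for markup.
    without_tags = re.sub(r"</?[A-Za-z][^>]*>", "", raw)
    # Then decode the entities (&amp;, &lt;, &gt;, ...) to characters.
    return html.unescape(without_tags)
```

The order matters: decoding first would turn a quoted &lt;c&gt; into <c>, which the tag-stripping step would then delete.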

On Fri, Sep 19, 2008 at 10:23 PM, Aaron Swartz <me@aaronsw.com> wrote:
>>>> http://bulk.resource.org/gpo.gov/bills/108/h5352ih.txt
>>> Those documents are actually lame HTML, not text. (They're wrapped in
>>> a <pre> tag and &, <, and > are all quoted.)
>> we just shelve what they ship.  :)
> yeah, but you should serve them with content-type: text/html so they
> render correctly.