Public.Resource.Org—2011 Annual Report

  1. House of Representatives. Our most successful program in 2011 was also our most frustrating. Based on a letter from the Speaker of the House we worked with the Committee on Oversight and Government Reform (OGR) to create House.Resource.Org. The system features bulk access via http, rsync, and ftp to hearings, which are also made available on the Internet Archive and on YouTube. With an agreement with OGR, we were able to get transcripts of current hearings and use those to create closed captions. In addition, a large number of hearings were procured from C-SPAN and the Select Committee on Energy Independence and Global Warming gave us their full archive.

    A large number of meetings were conducted with House staff in 2011 to try to bring this program to the next level. Due to some miscommunication, the House sent us a large part of their full archive by mistake, which we released. A number of proposals were advanced that would have provided broadcast-quality multicast of all hearings on the Internet2 backbone, and a number of attempts were made to convince the House staffers to let us continue the process of putting the rest of the archive on-line. A full paper trail is available on this back-and-forth with staffers. Although we were not successful in getting approval to multicast all House channels, our efforts do appear to have prodded the House and the Library of Congress to create a new service that includes live webcasts of all hearings and high-quality archiving.

    We were pleased to help release over 14,500 hours of video from House hearings, some 41 terabytes of raw data. We were frustrated that House staff fought so hard to cut off release of the archive and against carrying these efforts to their logical conclusion, which is the release in a clueful format of all public hearings for use by the public.
  2. FedFlix. This program to make videos from the executive branch available had 12,018,444 views on the Internet Archive and 10,097,308 views on YouTube. Over 400 ContentID matches were received on the YouTube channel, which we used as the basis for a Report to the Archivist suggesting that the National Archives and Records Administration should take over the FedFlix program. The Archivist met with us several times and has promised an answer early in 2012. In addition, a column by Cory Doctorow in the Guardian shed light on the ContentID issue, and after a meeting with YouTube Legal we have been systematically challenging these purported copyright matches. To date, 177 of the initial 325 ContentID matches have been successfully removed.
  3. Public Safety Codes. Our version of California's Title 24 has undergone intensive work this year. There are 21 code revisions on the code repository. The Title 24 codes are now available online as basic HTML, recreated PDF documents (with pagination that matches the originals), and an HTML version with 897 of the 1,334 original images converted to SVG graphics. Heavy lifting to convert the graphics was by the Rural Design Cooperative, a Point.B Studio program that mentors students to teach them skills in subjects such as how to create vector graphics, how to code in MathML, and other advanced graphics training.

    One of the key issues in public safety codes is the question of Incorporation by Reference, where privately developed standards are incorporated into duly passed legislation or regulation. Carl participated as an appointed public member in the Administrative Conference of the United States which passed a recommendation on the subject that is, quite frankly, a step backward in public policy when it comes to availability of the law. We were pleased that the EFF drafted a forceful statement but it is clear that this subject is going to be a pressing public policy concern in the years to come. As Justice Stephen Breyer put it when he addressed the Administrative Conference, "if a law isn't public, it isn't a law."
  4. U.S. Court of Appeals. A strong effort was mounted to get lawyers to adopt a volume of federal opinions, complete with a full-page ad in the ABA Journal. Even though we were not successful in getting a mass outpouring of civic pride from the bar, we are very proud to have made available as double-keyed XHTML over 100 years of the opinions of the U.S. Court of Appeals, including all 30 volumes of the Federal Cases and 44 volumes of the Federal Reporter.

    After appearing before the 9th Circuit and meeting with the Historical Society, we scanned about 3.3 million pages of 9th Circuit briefs from 1890 to 1970. We now have a version of these documents in which each individual brief is pulled out into a PDF file and metadata is typed into a master spreadsheet. Our goal for Q1/2012 is to pull a large number of paper documents from the National Archives and use those to compare our electronic archive to the official paper records. This analysis will be submitted to the Archivist of the United States with a request that the electronic archive be certified as an Official Reference Copy, a step which would permit the Court, if it should so choose, to adopt the archive for their own use.
  5. National Reporter. One of our goals, a goal which we failed to achieve, was to create a national reporter system consisting of all current opinions of appellate and supreme courts at the federal and state level. Our initial strategy was a commercial arrangement with Fastcase, Inc. to provide a weekly feed, delayed by one week, of all opinions. We terminated this commercial arrangement in August. Since July, a similar service has been provided free of charge by Justia, Inc. While the Fastcase service included HTML versions of all the cases, the Justia service is PDF only. Perhaps the most promise in this area is a service created by CourtListener which provides a crawl of all federal appellate and supreme court opinions. Public.Resource.Org is helping to support the CourtListener system and will continue that support in 2012, hopefully with the active participation of the University of California at Berkeley.
  6. Mass Digitization. We participated in a number of activities to spur awareness of the need to greatly increase the scope and volume of federal digitization activities. Carl sits on the steering group of the Digital Public Library of America and gave a speech in Washington advocating a national digitization effort. Most recently, a public letter to the President was co-authored with John Podesta of the Center for American Progress, coupled with a White House petition calling for the creation of a Federal Scanning Commission.
  7. Restrictions on Works of Government. A complaint entitled What Would Luther Burbank Do? was submitted to the Smithsonian asking for a reexamination of their intellectual property policies. The complaint featured a large number of products that used seed display catalogs posted by the Smithsonian, imagery that the Institution claimed required prior permission to use. Some of the protest activities included a press campaign, installation of large display posters at locations such as the Sonoma County government headquarters, a postcard campaign on Twitter featuring complaints from citizens such as Ralph Nader, and a reception at the New America Foundation featuring bottles of an open source protest beer. Members of Congress have been briefed on this dispute and the General Counsel of the Smithsonian Institution has let us know that the complaint is under consideration.
  8. Other Data. We provided a full mirror of Google's patent system, which in turn mirrors data from the U.S. Patent and Trademark Office. At this point, we're happy that the Google system is working well with the other free patent services in operation and we'll be decommissioning our mirror. As a side project, we are slowly mirroring the Library of Congress imagery database, a system that is approximately 10 tbytes of image files. A minor database we helped put online in 2011 contains all the prior recommendations of the Administrative Conference of the U.S. In addition, we helped chair a series of meetings at the Center for American Progress on the topic "the future of FOIA" and provided a $15,000 grant to MySociety to support their efforts to localize WhatDoTheyKnow for U.S. FOIA applications. We also provided small grants in support of the IndiaKanoon.Org system for Indian caselaw and the RichmondSunlight.Com system, which tracks the Virginia legislature. Finally, we provided a $60,000 grant to Princeton University in support of their privacy research with indeterminate results.
  9. Systems. Our systems went through a series of upgrades this year and now consist of one rack at a major San Francisco colocation point and another rack at ISC headquarters. All our systems are now on full A/B power (dual power supplies on each system, dual UPSs, dual PDUs, and generator backup). We have two 100-tbyte systems and four 30-tbyte systems plus a handful of utility servers for DNS, mail, and related services. All our data is now on two or more systems. We've been able to retire all our Solaris and SPARC systems, and our core services of https/http/rsync/ftp are working on all our public systems. In addition to our hosting agreement with ISC, we have a 50-mbps allocation of bandwidth from Sebastopol through Sonic's network. Our goals for 2012 include better failover between live systems and backups, making sure our systems are all IPv6 accessible, rolling out DNSSEC, and continued security and operational audits. We believe we have sufficient disk and CPU capacity at this point for 2012.
  10. Corporate. Public.Resource.Org gratefully acknowledges a grant of $1,000,000 from Google.Org to support our Global Rulebook project in 2012, a project with an aim of putting all legal codes of the world online in a common format. We ended 2011 with a little over $1.3 million in the bank. We have retained our auditor for the 2011 audit, which is now underway, and will be using the same firm as last year for preparation of our 2011 tax returns. As always, we continue to observe all Best Current Practices for financial controls, conflicts of interest, and related party transactions. Our core operation consists of our Sebastopol headquarters sublet from O'Reilly Media and our hosting at the Internet Systems Consortium. Our two primary contractors are Mike D. Kail for systems support and Point.B Studio for general design and for support of our Title 24 efforts.

Thank you for your support. Last revised Sun Jan 1 15:08:24 PST 2012.