March 13, 2007

The Honorable Nancy Pelosi
Speaker of the House
U.S. House of Representatives
Washington, D.C. 20515

Via email.

Dear Speaker Pelosi:

I write to you today with the results of two years of research on creating broadcast-quality video of congressional proceedings for download on the Internet. My conclusion can be summarized as follows:

By the end of the 110th Congress, the U.S. House of Representatives could achieve the goal of providing broadcast-quality video of all hearings and the floor for download on the Internet.

In early 2005, when I joined the Center for American Progress as a Senior Fellow and Chief Technology Officer, John Podesta asked me to look into the question of the video record of the U.S. Congress. This issue was a natural one for both of us. John, of course, served many years in senior staff positions in the Senate, and has always had a strong interest in technology policy. When I created the first radio station on the Internet, one of my first actions was to apply for broadcast credentials with the U.S. Congress, where I was able to create the first webcasts from the floor of the House and Senate starting with Bill Clinton's State of the Union Address in 1995. I was also able to work with the Joint Economic Committee later that year to do the first on-line congressional hearings.

01  http://museum.media.org/radio/

02  http://town.hall.org/radio/JEC

My investigation into the question of whether broadcast-quality video from all congressional hearings can be made available on the Internet for download was conducted in two stages. In stage 1, I conducted an extensive feasibility analysis, examining all the technical and financial issues involved. In stage 2, I created a proof-of-concept prototype to get hands-on experience with the current congressional webcasts and to demonstrate how one might apply more systematic techniques towards creation of an archive.

Stage 1: Feasibility Analysis

There are two ways that video is produced for a congressional hearing:

A member of the media obtains credentials from the United States Senate Radio-Television Gallery, which manages the credentials-granting process on behalf of the United States House of Representatives Radio-Television Gallery. While there are issues with the composition of the gallery, in particular the ban on “new media” membership applications, enforced by congressional employees acting to shield the “old media” Executive Committee from competition, I believe that to focus on the media is to lose sight of the needs of the public and the responsibility of the U.S. Congress to provide a complete record.

The media, by definition, will choose what they want to cover. On any given day, there may be a dozen simultaneous hearings in the U.S. House of Representatives, and even with a greatly expanded gallery membership, it is unlikely that the media would choose to provide a comprehensive video record of all hearings. However, just because an event is not worthy of the evening news does not mean it should be omitted from the public record or not made available on the Internet to reach the “long tail” it provides.

My investigation thus focused primarily on what it would take for the U.S. Congress to directly provide this permanent video record in the form of broadcast-quality video for download on the Internet.

The term “broadcast-quality” is subject to interpretation, and I believe some flexibility should be maintained in specifying this standard, leaving individual committees to determine the parameters based on operational considerations. However, for purposes of illustration and based on my own personal experience, I would consider broadcast-quality to be an 8 Mbps stream of H.264-encoded video using the NTSC standard with a 48 kHz stereo audio stream, delivered in an MPEG-4 format. This is approximately 3.6 gigabytes per hour of video. While this may sound large, there are numerous sites on the Internet working with this quality of data or even significantly better standards such as 1080i or 720p High-Definition video.
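For readers who wish to check the arithmetic behind the 3.6 gigabyte figure, the short Python sketch below converts a constant video bitrate into storage per hour. It is offered only as an illustration: the function name is hypothetical, it assumes decimal units (1 gigabyte = 1,000 megabytes), and it counts only the 8 Mbps video stream, leaving out the comparatively small audio stream and any container overhead.

```python
def gigabytes_per_hour(video_mbps: float) -> float:
    """Convert a constant bitrate in megabits per second to decimal gigabytes per hour.

    Hypothetical helper for illustration; audio and container overhead are ignored.
    """
    seconds_per_hour = 3600
    megabits = video_mbps * seconds_per_hour  # total megabits in one hour
    megabytes = megabits / 8                  # 8 bits per byte
    return megabytes / 1000                   # assumes 1 GB = 1,000 MB


if __name__ == "__main__":
    print(gigabytes_per_hour(8))  # prints 3.6
```

The same function can be used to compare other candidate bitrates should a committee choose different parameters.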

In the course of my investigation of this topic, I held numerous discussions with senior scientists who are experts in computer networking and video encoding and transmission. I discussed the matter in depth with teams from Cisco, Google, and Sun Microsystems, and brought the matter up directly with protocol architects who are the authors of the standards that are used today for “webcasting” and other live and streaming media services.

I also spent considerable time examining the physical infrastructure on Capitol Hill and in the Washington, D.C. area. I met with technical staff from Verizon, Switch and Data, and Equinix. I discussed the matter with routing experts who are responsible for much of the Internet infrastructure that is used to connect the Washington area to both commercial networking fabric and to the Research and Education networks such as Internet2 and National LambdaRail. I met repeatedly with the Chief Information Officer and other officials from Georgetown University and other area institutions that maintain large network presences.

In addition to these numerous technical consultations, I met with officials from the Government Printing Office, who said such a goal was very much in the spirit of their own Strategic Vision for the 21st Century. In a letter to me on July 27, 2006, Bruce R. James, the Public Printer of the United States, was very supportive of the concept of “live and permanently archived video from Congressional and Executive branch public proceedings on the Internet.” Mr. James encouraged me to continue my consultations and to keep his Chief Technology Officer and the Superintendent of Documents informed as to my conclusions.

Finally, I met numerous times with staff members of the Gallery and with staff from many committees to learn of progress on technical issues such as the wiring of committee rooms and the installation of cameras. From first-hand experience, I understand how difficult it is to wire these historic facilities. I was very pleased to learn of your recent efforts to accelerate the pace of these efforts, and my discussions with your staff lead me to believe that the pace is such that there is no doubt that every committee room will have cameras in the next two years.

The physical infrastructure will thus soon be in place to provide systematic, comprehensive coverage of each hearing. I will not discuss all of those details in this letter, but for anyone who is interested, a technical lecture I gave on the topic to the Google engineering staff as part of their “tech-talk” series is available on-line:

03  http://video.google.com/videoplay?docid=2633159172413478267

While the technical issues are complex, the bottom line is very simple. I have absolutely no doubt that it is technically and financially feasible within 2 years for the U.S. Congress to provide broadcast-quality video from every hearing and from the floor for download on the Internet.

Just because something is feasible does not mean that it is desirable. Providing this data is a policy choice, not a technical question. I thus turn now to the question of what such a video record might mean.

Stage 2: Proof of Concept Prototype

For two weeks starting February 26, 2007, I set about systematically “ripping” webcast video streams from congressional committees, transcoding the data, and uploading the hearings to the Internet Archive and to Google Video. (“Ripping” is the process of converting something that is only available as a webcast stream into a file. “Transcoding” is converting the format of the video.) I picked the Internet Archive and Google Video as examples of one non-profit and one for-profit entity, but they are simply examples. There are literally hundreds of services—from YouTube to universities and everything in-between—that would be able to use this data.

The question I asked myself in creating these hearings on two different video upload sites was whether there is added value in having many different ways of looking at the U.S. Congress, or whether the current system of depending solely on the committees' webcast and web presence is sufficient. In other words, should the U.S. Congress be the sole “retail” provider of a presence to people through house.gov, or would it be better if the U.S. Congress also played a “wholesale” role by providing bulk data to be processed by other systems? Note that the terms “retail” and “wholesale” are used to differentiate between direct service to end users and bulk distribution of raw data, and do not imply any financial transactions. All U.S. government data is, of course, in the public domain and available at no charge.

In the course of two weeks of processing congressional hearings, I was able to download, transcode, and upload 63 hearings for a total of 160 hours and 43 minutes of video. You may see the Internet Archive version of this video here:

04  http://www.archive.org/search.php?query=hooptedoodle

You will notice that this particular service has some features that are not available on congressional web sites. For example, users are able to annotate hearings with ratings and reviews. The site allows users to both download and stream video, and the video has been converted to a non-proprietary standard that works across all different operating systems and players. And, because I am drawing on hearings from several different committees, it offers a degree of unification not available from the house.gov sites, which are all administered separately.

In addition, users can easily navigate the archive by a series of keywords. For example, one can search for all hearings from a particular committee, such as the Committee on Appropriations:

05  http://www.archive.org/search.php?query=appropriations.house.gov

The Committee on Appropriations illustrates the importance of maintaining an archive. Appropriations, along with Foreign Affairs, Ways and Means, and several other committees, has a current policy of offering a live webcast, but not maintaining an archive of its proceedings. This means that the prototype that I have built contains the only video archive on the Internet for these committees, making me the point of origination for the data. I do not believe that it is appropriate for a private citizen such as myself to be the point of origination for the public video record of these committees.

Another issue that I uncovered in the course of transcoding this video was that all the committees provide video encoded with proprietary technologies. Indeed, several committees go so far as to say that the use of Microsoft software is “required” to interact with the committee, and provide the company with free ads with links into the Microsoft site to get products. Even if the committee chooses to publish their video in a proprietary format, it is simply not correct technically to say that this video must be viewed on Microsoft's products: these files could just as easily be played using open source tools such as VLC media player, Democracy Player, or MPlayer. Any member of the public should be able to choose to use Microsoft software, but to maintain that the software is “required” is a commercial endorsement by the U.S. Congress that has monetary value to the company previously mentioned.

In the course of building this prototype, I was able to examine the video quality of the webcasts of a large number of committees. The quality of the webcasts ranges from barely acceptable to very, very bad. In some cases, the committees provide a video picture of 160 by 120 pixels at 10 frames per second, video quality that is significantly worse than that which I used in 1993 when I was doing some of the first webcasts on the Internet. (I should also note that the process of ripping and then transcoding further degrades quality, an issue that did not concern me in building the proof of concept prototype and which would not occur if the full-resolution files were available for download.) Even the best webcasts are not of adequate quality to be of use in video production, even at the consumer level with software such as iMovie, let alone for a film or news production.

The video cameras installed in the Capitol are very good and there are no technical obstacles to offering better quality. It is simply a matter of a few more bits. Files can easily be uploaded to an FTP server where the miracle of Internet distribution will move the data out into the rest of the world, so there are no capacity issues to worry about. Providing bigger files and distributing them on the net is a question of policy, not a technical issue.

To illustrate the difference better quality makes, I have posted two versions of a hearing by the Committee on House Administration, which maintains one of the better web sites in house.gov. The first URL below is a transcoding of the version offered by the Committee from their web site, transformed into a downloadable file, the procedure I used on all 63 hearings. The second URL is taken directly from a DVD which I bought from C-SPAN and uploaded with their permission:

06  http://www.archive.org/details/gov.house.cha.05252006-final

07  http://www.archive.org/details/gov.house.cha.05252006-dvd

As can be seen, the C-SPAN version offers the viewer significantly better quality options. A professional filmmaker or a news organization could download the full resolution version of the file, which is quite large. A YouTuber or other mashup artist could download something smaller, and the casual viewer could simply watch a version directly on the site.

If one doesn't like the “look and feel” of the Internet Archive, the whole point of making the data available in bulk is that there will be options. Here is that same hearing on Google Video:

08  http://video.google.com/videoplay?docid=1058295326420473490

Notice that the presentation is very different. Here, anybody can add a tag to the data or annotate, not just the document creator. The site includes the ability to download a version preformatted for the iPod. It thus becomes a matter of personal preference for members of the public to decide how best to consume their data.

Of the 63 hearings that were uploaded to the Internet Archive, 36 were uploaded to Google Video as a sanity check. Here is the collection on that service:

09  http://video.google.com/videosearch?q=hooptedoodle

As a further proof of concept, I took the hearing from C-SPAN, available under their new license policy, and uploaded it to Azureus, a service based on the BitTorrent peer-to-peer protocol:

10  http://www.zudeo.com/az-web/search?q=hooptedoodle

Conclusions

I offer my prototypes not as an illustration of what I think a congressional web site should look like, or even a model of what a fully-developed external site might look like, but simply to illustrate why the U.S. Congress should take advantage of the “network effect” of having data available in bulk for others to work with. It is, of course, vital that the U.S. Congress be the official source of all this data, offering a permanent, public archive of proceedings and hearings. Providing broadcast-quality video from all congressional hearings for download on the Internet will encourage the dissemination of this data and the creation of new and exciting ways for the public to interact with their elected representatives.

Based on 25 years of experience in the field of computer networking and a 2-year investigation of this specific issue, I have absolutely no doubt that it is technically and financially feasible for the U.S. Congress to provide a permanent broadcast-quality video record of proceedings and hearings for download on the Internet. Technically speaking, this is a “no-brainer.” This is simply a matter of will.

Offering broadcast-quality video of the floor and all hearings for download on the Internet by the end of the 110th Congress is a reachable goal for the U.S. House of Representatives, and one that will set a standard for transparency and openness. Your leadership in embracing this goal would set an example for all branches of the federal government, indeed for all governments here and abroad. If a hearing is to be considered truly public, the public has to be able to see it, both now and forever. I encourage you to adopt this standard as a goal for the 110th Congress.

Respectfully yours,

Carl Malamud

cc:

The Honorable Juanita Millender-McDonald, Chairwoman
Committee on House Administration

The Honorable David R. Obey, Chairman
Committee on Appropriations

The Honorable Judith C. Russell
Superintendent of Documents

The Honorable John D. Podesta, President and CEO
Center for American Progress
