Josh Tauberer Gr’11 believed the government should just give him the data he needed to create GovTrack, his website to help people follow the progress of Congressional legislation. When the powers-that-be said No, he went out and got it anyway.

BY ALYSON KRUEGER | Photograph by Jim Graham

If you want to understand the 21st-century struggle to get the US government to release its data to the public, says Jim Harper, director of information policy studies at the libertarian Cato Institute, think about a newspaper.

Go to the weather page, and “you’ve got a lot of data,” he says. “The maps, and the charts, and things like that, that people use all the time to assess what is up with the weather.” Same for the financial section: “Data, data, data—really high numbers of facts per square inch, if you will.” And don’t even mention sports.

Then go to the national news. What do you see? Plenty of editorials and narratives rehashing what’s “happening” on Capitol Hill, but very little hard information. “Data is really about representing facts,” Harper says. “There isn’t much to do with ideology, and everybody agrees that there should be more data so there could be less ideology, frankly.”

One of the most prominent figures in this quest—Harper calls him “the guy” on open government data—is “civic hacker” Joshua Tauberer Gr’11. Tauberer is the author of Open Government Data: The Book (available for download online), a signatory to the “8 Principles of Open Data” created by 30 leading open government data advocates in 2007, and a lobbyist to—and sometime consultant for—both the House and the White House on transparency issues. But he is best known for being the developer and maintainer of the website GovTrack, the go-to source for legislation-related information for journalists, lobbyists, and political activists of every persuasion, and even members of Congress and their staffs.

“It’s surprising that one guy with an interest in data was able to do as much as Josh has,” Harper adds. Tauberer’s success in collecting data scattered across multiple government websites and collating it into useful form (known as “screen scraping”), and his diligence in helping to pressure the government to make more of its data publicly available, have put him “at the leading edge of a massive change in the way government works,” Harper says.

You wouldn’t necessarily know it to look at him.

In appearance, Tauberer is a classic (and self-described) “geek”: scrawny figure, boyish face, formidable computer bag—easily the biggest in the atrium of the National Portrait Gallery in Washington, where we meet. He went to Princeton undergrad, majoring in psychology, and his Penn PhD is in linguistics. His speech is punctuated by long, considering pauses, and he expresses himself softly in thoughtful, well-constructed sentences. At 31 he still comes off as shy and modest: a mention of a New York Times story about him is quickly followed by, “but I’m sure nobody read it …”

All the same, while he may not be a “torches and pitchforks” kind of guy—as fellow open-data advocate Daniel Schuman, policy director for Citizens for Responsibility and Ethics in Washington, puts it—Tauberer has been at the forefront of a decade-long initiative to get the United States government to release information about how it works and what it does.

Tauberer calls this “the application of Big Data to civics.”

He was still in college when he decided he wanted to create a website that tracked every move Congress made—when it introduced a new bill, when that bill entered and left committee, how the bill was altered, when a vote occurred, and so on. But first he had to get the information from the government, a process that proved much harder than he expected. The problem is that, while the government does technically release that information to the public, it does so in a scrambled form that neither humans nor computers can easily put together to use.

To get the information he needed for his website, Tauberer started hacking into government data. He didn’t just keep what he found for himself but released it for use by any other programmer who wanted to build their own website or app.

In this context, “Hack doesn’t mean anything related to cyber-terrorism,” Tauberer insists. “Since I studied linguistics, I can say from an educated position that it is a different word from the word that is used for all the other stuff.” (This is probably as close as he comes to using his Penn degree professionally, by the way.)

Along with a community of like-minded data-lovers, Tauberer also started lobbying the government to change the way it releases information about itself.

GovTrack currently receives 30,000 visitors a day. The website is so successful that it is almost always the first item that comes up during a Google search on legislation or congressional activities. Government bodies, including the House of Representatives and the White House, are working toward implementing Tauberer’s suggestions on improving transparency. His work has also been featured in The New York Times and Forbes magazine, which in December 2011 named Tauberer one of the most influential people under the age of 30 in the law and policy arena.

