January 27, 2009 7:06 AM PST

Activists call for a mashup-friendly Recovery.gov

by Chris Soghoian

As President Obama's $825+ billion financial stimulus package works its way through Congress, a number of groups have started to call for increased transparency in the way that data on the proposed spending will be shared with citizens.

Most noteworthy are demands from public-interest groups and academics that the data be provided in a format conducive to user-generated mashups and remixes.

The American Recovery and Reinvestment Act of 2009 passed through the House Appropriations Committee a couple of weeks ago, and it is expected to come up for a full House vote in the coming weeks.

In addition to authorizing the spending of an obscene amount of money, the act also mandates the creation of a Web site to "foster greater accountability and transparency" in the use of those funds.

While the bill does a great job in mandating the kinds of information that will be put online (contracts, audits, inspector general reports, etc.), it is rather vague with regard to details on how the information will be provided.

The only hints include language mandating that the information be "easy to understand" and "regularly updated," and include a "database of findings from audits," "printable reports," and "user-friendly visual presentations to enhance public awareness of the use of funds."

Such statements bring to mind the possibility of yet another boring and difficult-to-navigate federal government Web site, perhaps similar to the Federal Communications Commission's antiquated and ineffective home page, or the Federal Election Commission's slothlike campaign donation search engine.

Faced with the possibility of another Web 1.0 Web site designed by the federal bureaucracy, a number of pro-transparency activists and tech policy academics have started to weigh in on the issue, all of them demanding the same thing: full, easy, and free access to the complete data set powering the Recovery.gov Web site.

For example, while the FEC's donation search engine was often slow and unresponsive during last year's presidential campaign, a number of third parties were able to create fantastic mashups of the campaign donation data--the most notable of these being the Huffington Post's FundRace tool, which provides users with a Google map view of each donation to the presidential campaigns.

The numerous independent sites allowing for the easy navigation of campaign donation data were possible because of the legal requirement that all FEC data be made available in full to the public. As a result, public-interest groups and media organizations were able to create their own mashups and remixes of the data, providing faster and more responsive Web interfaces than the FEC's overwhelmed servers, as well as creating innovative visualization methods for navigating the data set.

John Wonderlich, program director at the nonpartisan Sunlight Foundation, outlined the general problem:

We'd like the site to serve not just the amateur information consumer, but also the programmers that can skillfully remix the information. The citizen observer's role seems well-addressed by the legislation that mandated the site (with requirements for "printable reports," feedback, and to be "easy to understand"), while the needs of the programmer are largely unaddressed. The data should be available in formats that facilitate more advanced use by programmers and analysts alike.

Certainly, the data should be made available following the 8 Principles of Open Data: (1) complete, (2) primary (as it is collected at the source), (3) timely, (4) accessible, (5) machine-processable, (6) nondiscriminatory, (7) nonproprietary, and (8) license-free. XML and CSV are a minimum.

Search is great, if you are looking to find information about any one thing. But original analysis and visualization require access to data in bulk. If the goal of putting the data online is to increase accountability and transparency, then it is necessary (to) provide bulk data access.
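To make the distinction between search and bulk access concrete, here is a minimal sketch of the kind of analysis a CSV export would allow. Everything in it is an assumption made for illustration: the column names (`award_id`, `agency`, `state`, `amount_usd`), the sample rows, and the helper functions are hypothetical, and nothing here reflects an actual Recovery.gov format.

```python
import csv
import io
from collections import defaultdict

# Hypothetical bulk export. The column names and every row below are
# invented for illustration; they are not real Recovery.gov data.
SAMPLE_CSV = """award_id,agency,state,amount_usd
A-001,Department of Transportation,OH,1500000
A-002,Department of Energy,CA,2750000
A-003,Department of Transportation,CA,500000
A-004,Department of Education,OH,250000
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def total_by(rows, key):
    """Sum amount_usd over all rows, grouped by the given column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += int(row["amount_usd"])
    return dict(totals)


rows = load_rows(SAMPLE_CSV)
by_state = total_by(rows, "state")    # one pass over the whole data set
by_agency = total_by(rows, "agency")  # same data, regrouped a different way
```

A search box answers one question about one record at a time; with the full file in hand, the same few lines can regroup the data by any column, which is the starting point for a mashup or visualization.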

Echoing this last point, David Robinson, the associate director of the Center for Information Technology Policy at Princeton University, told me that "(no) one person or organization could possibly anticipate all the ways that Americans will want to analyze, reuse, or cross-reference the information that Recovery.gov will offer. And no one person or organization needs to do so, as long as the data itself is readily available."

In 2008, Robinson and his colleagues at Princeton published a paper calling for the government to provide open access to the raw data used by all federal Web sites. The highly influential paper has been widely circulated in technology policy circles in recent months.

Jim Harper, the director of information policy studies at the Cato Institute, feels that the entire back-end database should be made available.

"This is a little tricky, because people have to settle on a format, and then require submissions in that format from contractors and state and local entities, etc.," Harper told me. "But if the administration wants to be transparent, a little forcing will go a long way. States and contractors will learn how to deal with standardized data quickly, if it makes the difference on getting federal dollars."

A month ago, Harper moderated a one-day forum at Cato, in which a number of policy experts called for open access to government data. A video and podcast of that event can be found here.

Given that this bill has largely been written and shaped behind closed doors, it remains unclear how much of an impact these pro-transparency activists will have on the legislation that will create the Recovery.gov Web site. As of press time, calls for comment left with the House and Senate Appropriations Committees had yet to be returned.

Christopher Soghoian delves into the areas of security, privacy, technology policy and cyber-law. He is a student fellow at Harvard University's Berkman Center for Internet and Society, and is a PhD candidate at Indiana University's School of Informatics. His academic work and contact information can be found by visiting www.dubfire.net/chris/. He is a member of the CNET Blog Network, and is not an employee of CNET. Disclosure.
(6 Comments)
by smutticus_maximus January 27, 2009 8:30 AM PST
They need to start putting all government documents into a version tracking system akin to CVS or SVN.

I want to be able to check out a bill in committee and see who added which line items to it. If each congress critter had their own ID then I would be able to trace each critter's bacon. This would be the ultimate in accountability and would allow for all kinds of great mashups. Imagine being able to isolate all contributions from one congressperson and then track them through the committee process. There must already be some sort of informal way to control document versions. But in this day and age there is no reason why this can't be more formalized.

The technical side of this problem has already been solved. But I can imagine lawmakers would be absolutely scared stiff by this idea. All the more reason to do it.
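As a rough illustration of the per-line attribution described above (similar in spirit to the "blame" or "annotate" view in version control tools), here is a minimal sketch using Python's standard `difflib`. The member IDs, the bill text, and the `attribute_lines` helper are all hypothetical; a real system would have to handle moved and edited lines far more carefully.

```python
import difflib


def attribute_lines(revisions):
    """Credit each line of the final text to whoever first introduced it.

    revisions is a list of (member_id, lines) pairs, oldest first. Lines
    that survive unchanged from the previous revision keep their earlier
    owner; anything new is credited to the author of that revision.
    """
    prev_lines, prev_owners = [], []
    for member_id, lines in revisions:
        owners = [member_id] * len(lines)  # assume new until matched
        matcher = difflib.SequenceMatcher(a=prev_lines, b=lines, autojunk=False)
        for block in matcher.get_matching_blocks():
            for offset in range(block.size):
                owners[block.b + offset] = prev_owners[block.a + offset]
        prev_lines, prev_owners = lines, owners
    return list(zip(prev_lines, prev_owners))


# Hypothetical two-revision history of a bill in committee.
history = [
    ("member-A", ["Sec. 1. Line item X", "Sec. 2. Line item Y"]),
    ("member-B", ["Sec. 1. Line item X", "Sec. 1a. Line item Z", "Sec. 2. Line item Y"]),
]
blame = attribute_lines(history)
```

Here the inserted "Sec. 1a" line would be credited to `member-B`, while the two original sections stay with `member-A`.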
by xcal78 January 27, 2009 9:23 AM PST
The question is what value does that provide? Who's paying to provide the data at that level? It's one thing to have access to data; it's entirely another thing to have this kind of detailed breakdown on the data. Most companies don't even go that far for obvious financial reasons, so why would you even think the government would do what billion dollar companies don't even do? I'm sure everyone would like this but it's just not feasible right now. Just getting basic access to the data is step one. We can revisit this in 10 years when we get to step 2.
by xcal78 January 27, 2009 9:27 AM PST
I've deployed 2 different PDM systems at previous companies and couldn't see the government getting one due to the cost and resources required to maintain and deploy that type of solution.
by frank_tom January 27, 2009 9:35 AM PST
A Dozen Fun Facts About the House Democrats' Massive Spending Bill

1. The House Democrats' bill will cost each and every household $6,700 additional debt, paid for by our children and grandchildren.

2. The total cost of this one piece of legislation is almost as much as the annual discretionary budget for the entire federal government.

3. President-elect Obama has said that his proposed stimulus legislation will create or save three million jobs. This means that this legislation will spend about $275,000 per job. The average household income in the U.S. is $50,000 a year.

4. The House Democrats' bill provides enough spending - $825 billion - to give every man, woman, and child in America $2,700.

5. $825 billion is enough to give every person living in poverty in the U.S. $22,000.

6. $825 billion is enough to give every person in Ohio $72,000.

7. Although the House Democrats' proposal has been billed as a transportation and infrastructure investment package, in actuality only $30 billion of the bill - or three percent - is for road and highway spending. A recent study from the Congressional Budget Office said that only 25 percent of infrastructure dollars can be spent in the first year, making the one year total less than $7 billion for infrastructure.

8. Much of the funding within the House Democrats' proposal will go to programs that already have large, unexpended balances. For example, the bill provides $1 billion for Community Development Block Grants (CDBG), which already have $16 billion on hand. And, this year, Congress has plans to rescind $9 billion in highway funding that the states have not yet used.

9. In 1993, the unemployment rate was virtually the same as the rate today (around seven percent). Yet, then-President Clinton's proposed stimulus legislation ONLY contained $16 billion in spending.

10. Here are just a few of the programs and projects that have been included in the House Democrats' proposal:

* $650 million for digital TV coupons.
* $6 billion for colleges/universities - many of which have billion dollar endowments.
* $166 billion in direct aid to states - many of which have failed to budget wisely.
* $50 million in funding for the National Endowment for the Arts.
* $44 million for repairs to U.S. Department of Agriculture headquarters.
* $200 million for the National Mall, including grass planting.
* $400 million for "National Treasures."

11. Almost one-third of the so-called tax relief in the House Democrats' bill is spending in disguise, meaning that true tax relief makes up only 24 percent of the total package - not the 40 percent that President-elect Obama had requested.

12. $825 billion is just the beginning - many Capitol Hill Democrats want to spend even more taxpayer dollars on their "stimulus" plan.
by xcal78 January 28, 2009 9:50 AM PST
Good thing Bush was in office when this happened. We know who to blame!
by xcal78 January 28, 2009 9:56 AM PST
George W. Bush did a great job with:

"The Iraq War Will Cost Us $3 Trillion, and Much More"
http://www.washingtonpost.com/wp-dyn/content/article/2008/03/07/AR2008030702846_pf.html