Politics and Law

October 7, 2008 9:30 AM PDT

Government report: Data mining doesn't work well

by Declan McCullagh

The most extensive government report to date on whether terrorists can be identified through data mining has yielded an important conclusion: It doesn't really work.

A National Research Council report, years in the making and scheduled to be released Tuesday, concludes that automated identification of terrorists through data mining or any other mechanism "is neither feasible as an objective nor desirable as a goal of technology development efforts." Inevitable false positives will result in "ordinary, law-abiding citizens and businesses" being incorrectly flagged as suspects.
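The false-positive problem is largely a matter of base rates, and a little arithmetic makes it concrete. The sketch below is illustrative only: the population size, the number of actual targets, and both accuracy rates are hypothetical figures chosen for the example, not numbers taken from the report.

```python
def screening_outcomes(population, true_targets, detection_rate, false_positive_rate):
    """Return (true_hits, false_hits, precision) for an automated screen.

    All inputs are hypothetical; none come from the NRC report.
    """
    innocents = population - true_targets
    true_hits = true_targets * detection_rate        # real targets correctly flagged
    false_hits = innocents * false_positive_rate     # innocent people wrongly flagged
    precision = true_hits / (true_hits + false_hits) # share of flags that are correct
    return true_hits, false_hits, precision


# Assumed scenario: 300 million people, 3,000 real targets, and a screen
# that catches 99% of targets while wrongly flagging 1% of everyone else.
true_hits, false_hits, precision = screening_outcomes(
    population=300_000_000,
    true_targets=3_000,
    detection_rate=0.99,
    false_positive_rate=0.01,
)

print(f"Correctly flagged: {true_hits:,.0f}")
print(f"Wrongly flagged:   {false_hits:,.0f}")
print(f"Share of flags that are real: {precision:.4%}")
```

Under these assumed numbers, roughly 3 million innocent people are flagged alongside fewer than 3,000 real targets, so fewer than one flag in a thousand points at an actual suspect, even though the screen itself is "99 percent accurate."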

The whopping 352-page report, called "Protecting Individual Privacy in the Struggle Against Terrorists," amounts to at least a partial repudiation of the Defense Department's controversial data-mining program called Total Information Awareness, which was limited by Congress in 2003.

But the ambition of the report's authors is far broader than just revisiting the problems of the TIA program and its successors. Instead, they aim to produce a scholarly evaluation of the current technologies that exist for data mining, their effectiveness, and how government agencies should use them to limit false positives--of the sort that can result in situations like heavily armed SWAT teams raiding someone's home and shooting their dogs based on the false belief that they were part of a drug ring.

The report was written by a committee whose members include William Perry, a professor at Stanford University; Charles Vest, the former president of MIT; W. Earl Boebert, a retired senior scientist at Sandia National Laboratories; Cynthia Dwork of Microsoft Research; R. Gil Kerlikowske, Seattle's police chief; and Daryl Pregibon, a research scientist at Google.

They acknowledge that far more Americans live their lives online, using everything from VoIP phones to Facebook to RFID tags in automobiles, than did a decade ago, and that the databases created by those activities are tempting targets for federal agencies. And they draw a distinction between subject-based data mining (starting with one individual and looking for connections) and pattern-based data mining (looking for anomalous activities that could indicate illegal activity).

But the authors conclude the type of data mining that government bureaucrats would like to do--perhaps inspired by watching too many episodes of the Fox series 24--can't work. "If it were possible to automatically find the digital tracks of terrorists and automatically monitor only the communications of terrorists, public policy choices in this domain would be much simpler. But it is not possible to do so."

A summary of the recommendations:

* U.S. government agencies should be required to follow a systematic process to evaluate the effectiveness, lawfulness, and consistency with U.S. values of every information-based program, whether classified or unclassified, for detecting and countering terrorists before it can be deployed, and periodically thereafter.

* Periodically after a program has been operationally deployed, and in particular before a program enters a new phase in its life cycle, policy makers should (carefully review) the program before allowing it to continue operations or to proceed to the next phase.

* To protect the privacy of innocent people, the research and development of any information-based counterterrorism program should be conducted with synthetic population data... At all stages of a phased deployment, data about individuals should be rigorously subjected to the full safeguards of the framework.

* Any information-based counterterrorism program of the U.S. government should be subjected to robust, independent oversight of the operations of that program, a part of which would entail a practice of using the same data mining technologies to "mine the miners and track the trackers."

* Counterterrorism programs should provide meaningful redress to any individuals inappropriately harmed by their operation.

* The U.S. government should periodically review the nation's laws, policies, and procedures that protect individuals' private information for relevance and effectiveness in light of changing technologies and circumstances. In particular, Congress should re-examine existing law to consider how privacy should be protected in the context of information-based programs (e.g., data mining) for counterterrorism.

By itself, of course, this is merely a report with non-binding recommendations that Congress and the executive branch could ignore. But NRC reports are not radical treatises written by advocacy groups; they tend to represent a working consensus of technologists and lawyers.

The great encryption debate of the 1990s was one example. The NRC's so-called CRISIS report on encryption in 1996 concluded that export controls, which treated software like Web browsers and PGP as munitions, were a failure and should be relaxed. That eventually happened two years later.

August 22, 2008 2:11 PM PDT

Fatal flaws found in terrorism database

by Stephanie Condon

One of the country's most important terrorism databases is on the verge of failure after suffering from gross mismanagement and technical design flaws that went ignored for months, a congressional investigation found.

A congressional committee on Thursday called for an investigation into a program called "Railhead," which was supposed to upgrade the National Counterterrorism Center's integrated terrorist intelligence database, called Terrorist Identities Datamart Environment (TIDE). The database serves the United States' 16 separate intelligence agencies, and as of January, contained more than 500,000 names (PDF), according to the NCTC. The program has cost an estimated $500 million.

Railhead was also meant to improve TIDE Online, an unclassified version of the TIDE database, and NCTC Online, a classified database of terrorist information and intelligence reports available to counterterrorism analysts.

However, officials at the NCTC began making drastic changes to the Railhead program in recent weeks, according to the House Science and Technology Committee, including laying off hundreds of private contractors working on the program. The number of contractors has shrunk from more than 800 to just a few dozen. The program itself is now in jeopardy.

Representative Brad Miller, chairman of the House Science and Technology Committee's Investigations and Oversight Subcommittee, sent a letter (PDF) Thursday to the Inspector General of the Office of the Director of National Intelligence requesting an investigation into Railhead's near-collapse.

"Potentially hundreds of millions of dollars have been wasted, delivery schedules have slipped, contractor employees have been laid off," he wrote. "The end result is a current IT system used to identify terrorist threats that has been crippled by technical flaws and a new system that if actually deployed will leave our country more vulnerable than the existing yet flawed system in operation today."

Miller noted the problems with TIDE and Railhead stem from "fundamental design flaws," namely their reliance on Structured Query Language (SQL) to search the database. SQL is a query language that retrieves records through structured statements run against defined fields, as opposed to the free-text keyword searches that search engines such as Google perform.
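To see why the distinction matters, here is a minimal sketch contrasting the two styles of search, using Python's built-in sqlite3 module. The table, its columns, and the sample rows are invented for illustration; nothing here reflects the actual TIDE schema or the specific failures described in the congressional memo.

```python
import sqlite3

# Hypothetical table of incoming messages (not the real TIDE layout).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, subject_name TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO messages (subject_name, body) VALUES (?, ?)",
    [
        ("John Doe", "Routine travel report."),
        ("Doe, John", "Follow-up on earlier report."),
        (None, "Source mentions john doe in passing."),
    ],
)


def structured_search(name):
    """SQL-style lookup: the field must match the given value exactly."""
    rows = conn.execute(
        "SELECT id FROM messages WHERE subject_name = ? ORDER BY id", (name,)
    )
    return [row[0] for row in rows]


def keyword_search(*terms):
    """Search-engine-style lookup: every term may appear anywhere in the text."""
    hits = []
    query = "SELECT id, subject_name, body FROM messages ORDER BY id"
    for row_id, subject, body in conn.execute(query):
        text = f"{subject or ''} {body}".lower()
        if all(term.lower() in text for term in terms):
            hits.append(row_id)
    return hits


print(structured_search("John Doe"))   # matches only the exactly formatted record
print(keyword_search("john", "doe"))   # matches all three records
```

In this toy data, the structured query finds one record while the keyword search finds all three, including the reordered name and the mention buried in free text. Real systems can narrow that gap with normalization or full-text indexing, so this shows only the general tradeoff, not a diagnosis of Railhead.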

Due to faulty searches, tens of thousands of CIA messages to the NCTC have not been properly processed or reviewed, or may not have even reached the TIDE database.

On top of that, the TIDE database has reportedly crashed several times in recent months, delaying the delivery of updated terrorist intelligence data to the FBI's Terrorist Screening Center.

While TIDE already has problems, Railhead appears only to exacerbate them: The Railhead initiative would significantly downgrade NCTC Online's capabilities by preventing access to any intelligence community Web sites or data resources, such as sites for the CIA, DIA, or FBI.

The project is not only flawed but also behind schedule. Thirty-four of Railhead's 72 "action items" are past due, and two are behind schedule. Ten more tasks--five of them costing more than $92 million--are "significantly off-task."

Unnamed sources involved with the Railhead project also told Congress that some of the project's deals with private contractors were inappropriate. A memo (PDF) produced by congressional staff cites sources who allege that SRI International's involvement in the project created a conflict of interest because SRI program director Earl Lyberger has close ties to Railhead's program manager Dirk Rankin.

Additionally, the staff's sources allege that the government misused funds by spending nearly $200 million to retrofit a building in Herndon, Va., belonging to one of the project's main contractors, Boeing.

Representatives from Boeing and SRI did not respond to requests for comment.

Miller noted in his request for an investigation into the program that there may be efforts under way to close down Railhead completely.
