Government report: Data mining doesn't work well
The most extensive government report to date on whether terrorists can be identified through data mining has yielded an important conclusion: It doesn't really work.
A National Research Council report, years in the making and scheduled to be released Tuesday, concludes that automated identification of terrorists through data mining or any other mechanism "is neither feasible as an objective nor desirable as a goal of technology development efforts." Inevitable false positives will result in "ordinary, law-abiding citizens and businesses" being incorrectly flagged as suspects.
The whopping 352-page report, called "Protecting Individual Privacy in the Struggle Against Terrorists," amounts to at least a partial repudiation of the Defense Department's controversial data-mining program called Total Information Awareness, which was limited by Congress in 2003.
But the ambition of the report's authors is far broader than just revisiting the problems of the TIA program and its successors. Instead, they aim to produce a scholarly evaluation of current data-mining technologies, their effectiveness, and how government agencies should use them to limit false positives--the sort of errors that can lead to heavily armed SWAT teams raiding someone's home and shooting their dogs based on the false belief that the residents were part of a drug ring.
The report was written by a committee whose members include William Perry, a professor at Stanford University; Charles Vest, the former president of MIT; W. Earl Boebert, a retired senior scientist at Sandia National Laboratories; Cynthia Dwork of Microsoft Research; R. Gil Kerlikowske, Seattle's police chief; and Daryl Pregibon, a research scientist at Google.
They admit that far more Americans live their lives online than a decade ago, using everything from VoIP phones to Facebook to RFID tags in automobiles, and that the databases created by those activities are tempting targets for federal agencies. And they draw a distinction between subject-based data mining (starting with one individual and looking for connections) and pattern-based data mining (looking for anomalous activities that could indicate illegal activity).
But the authors conclude the type of data mining that government bureaucrats would like to do--perhaps inspired by watching too many episodes of the Fox series 24--can't work. "If it were possible to automatically find the digital tracks of terrorists and automatically monitor only the communications of terrorists, public policy choices in this domain would be much simpler. But it is not possible to do so."
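The report's point about inevitable false positives comes down to base-rate arithmetic: when real targets are extremely rare, even a very accurate screen flags far more innocent people than guilty ones. The sketch below illustrates this in Python. The function name and every input number are hypothetical, chosen only for illustration; none of them come from the report.

```python
def flagged_breakdown(population, true_targets, sensitivity, false_positive_rate):
    """Return (true_positives, false_positives, precision) for a screening system.

    sensitivity: fraction of real targets the system flags.
    false_positive_rate: fraction of innocent people the system wrongly flags.
    """
    innocents = population - true_targets
    true_positives = true_targets * sensitivity
    false_positives = innocents * false_positive_rate
    flagged = true_positives + false_positives
    # Precision: of everyone flagged, what share are real targets?
    precision = true_positives / flagged if flagged else 0.0
    return true_positives, false_positives, precision


# Hypothetical inputs (assumptions, not figures from the report):
# 300 million people, 3,000 real targets, a screen that catches 99% of
# targets and wrongly flags only 0.1% of innocent people.
tp, fp, precision = flagged_breakdown(
    population=300_000_000,
    true_targets=3_000,
    sensitivity=0.99,
    false_positive_rate=0.001,
)
print(f"correctly flagged: {tp:,.0f}")
print(f"wrongly flagged:   {fp:,.0f}")
print(f"share of flagged people who are real targets: {precision:.2%}")
```

Under these assumed numbers, the wrongly flagged outnumber the correctly flagged by roughly a hundred to one, so only about one flagged person in a hundred is a real target.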
A summary of the recommendations:
* U.S. government agencies should be required to follow a systematic process to evaluate the effectiveness, lawfulness, and consistency with U.S. values of every information-based program, whether classified or unclassified, for detecting and countering terrorists before it can be deployed, and periodically thereafter.
* Periodically after a program has been operationally deployed, and in particular before a program enters a new phase in its life cycle, policy makers should (carefully review) the program before allowing it to continue operations or to proceed to the next phase.
* To protect the privacy of innocent people, the research and development of any information-based counterterrorism program should be conducted with synthetic population data... At all stages of a phased deployment, data about individuals should be rigorously subjected to the full safeguards of the framework.
* Any information-based counterterrorism program of the U.S. government should be subjected to robust, independent oversight of the operations of that program, a part of which would entail a practice of using the same data mining technologies to "mine the miners and track the trackers."
* Counterterrorism programs should provide meaningful redress to any individuals inappropriately harmed by their operation.
* The U.S. government should periodically review the nation's laws, policies, and procedures that protect individuals' private information for relevance and effectiveness in light of changing technologies and circumstances. In particular, Congress should re-examine existing law to consider how privacy should be protected in the context of information-based programs (e.g., data mining) for counterterrorism.
By itself, of course, this is merely a report with non-binding recommendations that Congress and the executive branch could ignore. But NRC reports are not radical treatises written by an advocacy group; they tend to represent a working consensus of technologists and lawyers.
The great encryption debate of the 1990s was one example. The NRC's so-called CRISIS report on encryption in 1996 concluded that export controls--which treated software like Web browsers and PGP as munitions--were a failure and should be relaxed. That eventually happened two years later.
Declan McCullagh, CNET News' chief political correspondent, chronicles the intersection of politics and technology. He has covered politics, technology, and Washington, D.C., for more than a decade, which has turned him into an iconoclast and a skeptic of anyone who says, "We oughta have a new federal law against this."

That's an impressive list of educated folks that came to the obvious conclusion. Now, how do we fix the mess? Should we elect another president that thinks of the Constitution as a piece of toilet paper and is driven to reignite Hitler's ideals, or should we elect change?
Think about it, it's not a hard decision to make. War, terrorism, torture, Gitmo, tanking economy, tanking dollar, homelessness and unemployment still sounding good to you? Feeling safer yet? Where is OBL again?
Voting for mcSHAME is a treasonous act against America!
http://news.cnet.com/8301-13772_3-9879556-52.html?tag=commProfileMain;profileBot
Cnet should be more reticent and use prudent caution when communicating gov't propaganda. Lots and lots of people are still brainwashed into believing the gov't only tells the truth based on their public "education." Pouring more urine into the cesspool of gov't lies really does a disservice to society. For example, how's the bailout working out? Hmph.
Methinks this isn't the only untruth from the Bushies which will later be debunked.
It's important for the National Academies to have an independent voice separate from the government. Compare, for example, the Institute of Medicine (part of the National Academies) and the NIH (a Federal agency). So it's a disservice to label NRC's work as a "government report".
As an aside, I found the references to the "shooting their dogs" incident out of place. That incident had nothing to do with data mining or terrorism. It wasn't even a case of "false positive." Someone mailed 30 lbs of drugs to a house, so they raided the house.
* If the NAS/NRC was created by the government and gets its annual budget from the government, it's part of the government. It's certainly not a private-sector enterprise.
* The shooting the dogs incident absolutely was an example of a false positive; more investigation needed to be done to double-check that false positive. Didn't happen.
Why has it taken them so long to figure out such obvious results?
All that intellect and not one working on cures for cancer, diabetes or whatever ails you. . .
And specifically to respond to Dalkorian, Hitler was a SOCIALIST, just like Lenin, Stalin, Mao, Pol Pot and the rest of the rogues gallery of 20th century criminals. NAZI is a shortened version of National Socialist. If you liked Hitler, you'll LOVE Obama!
Seriously, data mining is a good idea. About 20% of terrorists are going to post a page on MySpace outlining their terrorist activities.
You are going to feel very stupid for not at least catching those people.
It won't do anything for the other 80%, but nobody said it would be the only tool.
With that said, I agree Bush treats the Constitution like toilet paper and so do most Americans, and if you are voting for Obama as an alternative, guess what, you are just vacillating between Tweedledee and Tweedledum.
The only solution would be to actually do something different, Bob Barr comes to mind.
But as you said, those public school educations work wonders...you believe you can only vacillate from Repub. to Democrat year after year, so it's Carter...then Reagan....Bush...then Clinton...Bush again, then Obama....
that isn't real change.
You just have to trace it back to greed and power. Everything evil in the world traces back to those two things. Some people say religion too, but usually that's just when someone in the building wants money or power. So, yeah. It's always been that way.
Anyway, going through everyone's information or even having it would just make me feel dirty and perverted so I've always thought maybe there's that aspect too. Maybe they get off on it in some weird way because I just don't know how anyone could even go through with that kind of thing.
- by m_onger October 10, 2008 4:52 PM PDT
- Data mining is superb at finding facts that support a preconceived notion.
At least that is my conclusion based on some experience. False positives, as McCullagh calls them, would be bad enough if they just occurred with statistical randomness, but they are catastrophic to victims when they can be concocted by a malevolent miner.
Data mining by the Government is part of a challenge to America's culture - do we bargain away our freedoms for illusions of a secure life?