
[infowar.de] NSA Looks for More Ways to Mine Data

To: infowar - de -! - infopeace - de
Subject: [infowar.de] NSA Looks for More Ways to Mine Data
From: Ralf Bendrath <bendrath -! - zedat - fu-berlin - de>
Date: Wed, 01 Mar 2006 11:29:40 +0100
Mailing-list: contact infowar - de-help -! - infopeace - de; run by ezmlm

<http://www.nytimes.com/2006/02/25/technology/25data.html?%20ei=5058&en=cbcd71b864c16e09&ex=1141534800&partner=IWON&pagewanted=all>

NYT, February 25, 2006

Taking Spying to Higher Level, Agencies Look for More Ways to Mine Data
By JOHN MARKOFF

PALO ALTO, Calif., Feb. 23 - A small group of National Security Agency officials slipped into Silicon Valley on one of the agency's periodic technology shopping expeditions this month.

On the wish list, according to several venture capitalists who met with the officials, were an array of technologies that underlie the fierce debate over the Bush administration's anti-terrorist eavesdropping program: computerized systems that reveal connections between seemingly innocuous and unrelated pieces of information.

The tools they were looking for are new, but their application would fall under the well-established practice of data mining: using mathematical and statistical techniques to scan for hidden relationships in streams of digital data or large databases.

Supercomputer companies looking for commercial markets have used the practice for decades. Now intelligence agencies, hardly newcomers to data mining, are using new technologies to take the practice to another level.

But by fundamentally changing the nature of surveillance, high-tech data mining raises privacy concerns that are only beginning to be debated widely. That is because to find illicit activities it is necessary to turn loose software sentinels to examine all digital behavior whether it is innocent or not.

"The theory is that the automated tool that is conducting the search is not violating the law," said Mark D. Rasch, the former head of computer-crime investigations for the Justice Department and now the senior vice president of Solutionary, a computer security company. But "anytime a tool or a human is looking at the content of your communication, it invades your privacy."

When asked for comment about the meetings in Silicon Valley, Jane Hudgins, a National Security Agency spokeswoman, said, "We have no information to provide."

Data mining is already being used in a diverse array of commercial applications, whether by credit card companies detecting and stopping fraud as it happens, or by insurance companies that predict health risks. As a result, millions of Americans have become enmeshed in a vast and growing data web that is constantly being examined by a legion of Internet-era software snoops.

Technology industry executives and government officials said that the intelligence agency systems take such techniques further, applying software analysis tools now routinely used by law enforcement agencies to identify criminal activities and political terrorist organizations that would otherwise be missed by human eavesdroppers.

One such tool is Analyst's Notebook, a crime investigation "spreadsheet" and visualization tool developed by i2 Inc., a software firm based in McLean, Va.

The software, which ranges in price from as little as $3,000 for a sheriff's department to millions of dollars for a large government agency like the Federal Bureau of Investigation, allows investigators to organize and view telephone and financial transaction records. It was used in 2001 by Joyce Knowlton, an investigator at the Stillwater State Correctional Facility in Minnesota, to detect a prison drug-smuggling ring that ultimately implicated 30 offenders who were linked to Supreme White Power, a gang active in the prison.

Ms. Knowlton began her investigation by importing telephone call records into her software and was immediately led to a pattern of calls between prisoners and a recent parolee. She overlaid the calling data with records of prisoners' financial accounts, and based on patterns that emerged, she began monitoring phone calls of particular inmates. That led her to coded messages being exchanged in the calls that revealed that seemingly innocuous wood blocks were being used to smuggle drugs into the prison.

"Once we added the money and saw how it was flowing from addresses that were connected to phone numbers, it created a very clear picture of the smuggling ring," she said.

Privacy, of course, is hardly an expectation for prisoners. And credit card customers and insurance policyholders give up a certain amount of privacy to the issuers and carriers. It is the power of such software tools applied to broad, covert governmental uses that has led to the deepening controversy over data mining.
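To make the kind of tool discussed above more concrete, the following is a minimal Python sketch of the overlay Ms. Knowlton describes: find contacts that several inmates call, then check whether money flows from those same contacts. It is only an illustration and does not reflect how Analyst's Notebook works internally; the sample records, the function names (shared_contacts, overlay_money) and the threshold are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical sample data; none of these records come from the article.
calls = [
    # (caller, callee)
    ("inmate_A", "parolee_X"),
    ("inmate_B", "parolee_X"),
    ("inmate_A", "parolee_X"),
    ("inmate_C", "family_1"),
]
deposits = [
    # (recipient account holder, sender, amount in dollars)
    ("inmate_A", "parolee_X", 200),
    ("inmate_B", "parolee_X", 150),
    ("inmate_C", "family_1", 40),
]


def shared_contacts(calls, min_callers=2):
    """Return contacts called by at least `min_callers` distinct callers."""
    callers_by_contact = defaultdict(set)
    for caller, callee in calls:
        callers_by_contact[callee].add(caller)
    return {
        contact: sorted(callers)
        for contact, callers in callers_by_contact.items()
        if len(callers) >= min_callers
    }


def overlay_money(hubs, deposits):
    """Total the money each hub contact sent to the inmates who called it."""
    totals = Counter()
    for recipient, sender, amount in deposits:
        if sender in hubs and recipient in hubs[sender]:
            totals[sender] += amount
    return dict(totals)


hubs = shared_contacts(calls)
flows = overlay_money(hubs, deposits)
print(hubs)
print(flows)
```

In this toy data set the only contact shared by more than one caller is also the only one sending money to those callers, which is the sort of overlap an investigator might then examine by hand.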

In the wake of 9/11, the potential for mining immense databases of digital information gave rise to a program called Total Information Awareness, developed by Adm. John M. Poindexter, the former national security adviser, while he was a program manager at the Defense Advanced Research Projects Agency.

Although Congress abruptly canceled the program in October 2003, the legislation provided a specific exemption for "processing, analysis and collaboration tools for counterterrorism foreign intelligence."

At the time, Admiral Poindexter, who declined to be interviewed for this article because he said he had knowledge of current classified intelligence activities, argued that his program had achieved a tenfold increase in the speed of searching databases for foreign threats.

While agreeing that data mining has a tremendous power for fighting a new kind of warfare, John Arquilla, a professor of defense analysis at the Naval Postgraduate School in Monterey, Calif., said that intelligence agencies had missed an opportunity by misapplying the technologies.

"In many respects, we're fighting the last intelligence war," Mr. Arquilla said. "We have not pursued data mining in the way we should."

Mr. Arquilla, who was a consultant on Admiral Poindexter's Total Information Awareness project, said that the $40 billion spent each year by intelligence agencies had failed to exploit the power of data mining in correlating information readily available from public sources, like monitoring Internet chat rooms used by Al Qaeda. Instead, he said, the government has been investing huge sums in surveillance of phone calls of American citizens.


"Checking every phone call ever made is an example of old think," he said.

He was alluding to databases maintained at an AT&T data center in Kansas, which now contain electronic records of 1.92 trillion telephone calls, going back decades. The Electronic Frontier Foundation, a digital-rights advocacy group, has asserted in a lawsuit that the AT&T Daytona system, a giant storehouse of calling records and Internet message routing information, was the foundation of the N.S.A.'s effort to mine telephone records without a warrant.

An AT&T spokeswoman said the company would not comment on the claim, or generally on matters of national security or customer privacy.

But the mining of the databases in other law enforcement investigations is well established, with documented results. One application of the database technology, called Security Call Analysis and Monitoring Platform, or Scamp, offers access to about nine weeks of calling information. It currently handles about 70,000 queries a month from fraud and law enforcement investigators, according to AT&T documents.

A former AT&T official who had detailed knowledge of the call-record database said the Daytona system takes great care to make certain that anyone using the database, whether AT&T employee or law enforcement official with a subpoena, sees only information he or she is authorized to see, and that an audit trail keeps track of all users. Such information is frequently used to build models of suspects' social networks.

The official, speaking on condition of anonymity because he was discussing sensitive corporate matters, said every telephone call generated a record: number called, time of call, duration of call, billing category and other details. While the database does not contain such billing data as names, addresses and credit card numbers, those records are in a linked database that can be tapped by authorized users.

New calls are entered into the database immediately after they end, the official said, adding, "I would characterize it as near real time."
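The arrangement the official describes (a call record with a few fields, a separate linked store of billing data, access limited to authorized users, and an audit trail) can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not a description of the Daytona system: the class names, the account_id link field, and the authorization check are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallRecord:
    # Fields named in the article: number called, time, duration, billing category.
    number_called: str
    start_time: datetime
    duration_seconds: int
    billing_category: str
    account_id: str  # assumed link into the separate billing store


@dataclass(frozen=True)
class Subscriber:
    name: str
    address: str


class CallDatabase:
    def __init__(self, calls, subscribers, authorized_users):
        self._calls = list(calls)
        self._subscribers = dict(subscribers)
        self._authorized = set(authorized_users)
        self.audit_log = []  # entries are (user, action, target, allowed)

    def calls_to(self, user, number):
        """Look up call records by number called; every query is logged."""
        self.audit_log.append((user, "calls_to", number, True))
        return [c for c in self._calls if c.number_called == number]

    def subscriber_for(self, user, account_id):
        """Follow the link to billing data, but only for authorized users."""
        allowed = user in self._authorized
        self.audit_log.append((user, "subscriber_for", account_id, allowed))
        if not allowed:
            raise PermissionError(f"{user} may not read billing data")
        return self._subscribers[account_id]


db = CallDatabase(
    calls=[CallRecord("555-0100", datetime(2006, 2, 23, 9, 30), 120,
                      "residential", "acct-1")],
    subscribers={"acct-1": Subscriber("Example Person", "1 Example St")},
    authorized_users={"investigator_with_subpoena"},
)
hits = db.calls_to("analyst", "555-0100")
owner = db.subscriber_for("investigator_with_subpoena", hits[0].account_id)
```

Keeping names and addresses out of the call table, and logging denied requests as well as granted ones, mirrors the separation and audit trail the official describes.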

According to a current AT&T employee, whose identity is being withheld to avoid jeopardizing his job, the mining of the AT&T databases had a notable success in helping investigators find the perpetrators of what was known as the Moldovan porn scam.

In 1997 a shadowy group in Moldova, a former Soviet republic, was tricking Internet users by enticing them to a pornography Web site that would download a piece of software that disconnected the computer user from his local telephone line and redialed a costly 900 number in Moldova.

While another long-distance carrier simply cut off the entire nation of Moldova from its network, AT&T and the Moldovan authorities were able to mine the database to track the culprits.

Much of the recent work on data mining has been aimed at even more sophisticated applications. The National Security Agency has invested billions in computerized tools for monitoring phone calls around the world (not only logging them, but also determining content) and more recently in trying to design digital vacuum cleaners to sweep up information from the Internet.

Last September, the N.S.A. was granted a patent for a technique that could be used to determine the physical location of an Internet address, another potential category of data to be mined. The technique, which exploits the tiny time delays in the transmission of Internet data, suggests the agency's interest in sophisticated surveillance tasks like trying to determine where a message sent from an Internet address in a cybercafe might have originated.
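The general idea of locating an address from transmission delays can be illustrated with a simple "nearest landmark" sketch in Python. This is not the patented method, whose details the article does not give. It assumes a set of hypothetical measuring stations with known locations and made-up round-trip times, and it relies on the rough rule that signals in optical fiber cover about 200 km per millisecond in one direction.

```python
# Signals in optical fiber travel at roughly two-thirds the speed of light,
# which is about 200 km per millisecond in one direction.
KM_PER_MS_ONE_WAY = 200.0


def max_distance_km(rtt_ms):
    """Upper bound on the distance implied by a round-trip time in ms."""
    return (rtt_ms / 2.0) * KM_PER_MS_ONE_WAY


def nearest_landmark(rtts_ms):
    """Pick the landmark with the smallest round-trip time to the target.

    Returns the landmark name and the largest distance the target could be
    from it, given that delay.
    """
    name = min(rtts_ms, key=rtts_ms.get)
    return name, max_distance_km(rtts_ms[name])


# Hypothetical round-trip times (ms) from three landmarks to one target address.
measurements = {
    "landmark_east": 48.0,
    "landmark_central": 21.0,
    "landmark_west": 6.0,
}
guess, radius_km = nearest_landmark(measurements)
print(guess, radius_km)
```

Real delays include queuing and indirect routing, so a bound like this only says the target can be no farther away than the radius; it does not pin down a location.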

An earlier N.S.A. patent, in 1999, focused on a software solution for generating a list of topics from computer-generated text. Such a capacity hints at the ability to extract the content of telephone conversations automatically. That might permit the agency to mine millions of phone conversations and then select a handful for human inspection.
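As a rough illustration of what generating a list of topics from text can mean, the Python sketch below ranks the most frequent non-trivial words in a transcript. It is a deliberately simple stand-in and not the technique in the patent; the stopword list, the sample transcript and the function name top_topics are assumptions made for the example.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; real systems use far larger ones.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "for", "at"}


def top_topics(text, n=3):
    """Return the n most frequent words that are not stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(n)]


transcript = (
    "The shipment arrives Friday. Move the shipment to the dock. "
    "The dock closes after the shipment."
)
print(top_topics(transcript, n=2))
```

A filter built on such a list could, for instance, pass along only the transcripts whose top terms match a watch list, leaving the rest unread by people.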

As the N.S.A. visit to the Silicon Valley venture capitalists this month indicates, the actual development of such technologies often comes from private companies.

In 2003, Virage, a Silicon Valley company, began supplying a voice transcription product that recognized and logged the text of television programming for government and commercial customers. Under perfect conditions, the system could be 95 percent accurate in capturing spoken text. Such technology has potential applications in monitoring phone conversations as well.

And several Silicon Valley executives say one side effect of the 2003 decision to cancel the Total Information Awareness project was that it killed funds for a research project at the Palo Alto Research Center, a subsidiary of Xerox, exploring technologies that could protect privacy while permitting data mining.

The aim was to allow an intelligence analyst to conduct extensive data mining without getting access to identifying information about individuals. If the results suggested that, for instance, someone might be a terrorist, the intelligence agency could seek a court warrant authorizing it to penetrate the privacy technology and identify the person involved.

With Xerox funds, the Palo Alto researchers are continuing to explore the technology.
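One common way to get the effect described above, analysis without identities plus re-identification only under a warrant, is keyed pseudonymization. The Python sketch below is an assumption-laden illustration and not the Palo Alto Research Center design, which the article does not describe: the PseudonymVault class, the demo key and the warrant_granted flag are all hypothetical.

```python
import hashlib
import hmac


class PseudonymVault:
    """Held by a party separate from the analyst (an assumption of this sketch)."""

    def __init__(self, secret_key: bytes):
        self._key = secret_key
        self._reverse = {}  # pseudonym -> real identity, never shown to analysts

    def pseudonym(self, identity: str) -> str:
        """Map an identity to a stable token using a keyed hash (HMAC-SHA256)."""
        token = hmac.new(self._key, identity.encode("utf-8"),
                         hashlib.sha256).hexdigest()[:16]
        self._reverse[token] = identity
        return token

    def reveal(self, token: str, warrant_granted: bool) -> str:
        """Return the real identity only when a warrant has been granted."""
        if not warrant_granted:
            raise PermissionError("a court warrant is required to re-identify")
        return self._reverse[token]


vault = PseudonymVault(b"demo-key-not-for-real-use")
raw_records = [("alice@example.org", "call"), ("alice@example.org", "transfer")]

# The analyst receives only tokens, yet can still link records by person.
masked = [(vault.pseudonym(who), what) for who, what in raw_records]
same_person = masked[0][0] == masked[1][0]
```

Because the same identity always maps to the same token, patterns across records survive the masking, while the mapping back to a name stays with whoever holds the vault.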


Scott Shane contributed reporting from Washington for this article.

