BY DAWN LEVY
The Amazon.com computer knows your credit card number and the titles of the books you've bought. The computer at the pharmacy sees where you live and what prescriptions you need. The one at the online dating service stores the password you gave it -- the same password you use to access your bank accounts and e-mail.
Every day computers accumulate increasing amounts of sensitive data -- information that, if used improperly, could do harm. That includes private information about your health, your finances, your identity. Despite advances in cryptology, security, database systems and database mining, no comprehensive infrastructure exists for handling sensitive data over their lifetimes. And no widespread social agreement exists about the rights and responsibilities of data subjects, data owners and data users in a networked world. Collaborators in a new National Science Foundation (NSF) project aim to change that.
"Our goal is basically to invent and build tools that will help organizations mine data while preserving privacy," says principal investigator Dan Boneh, an associate professor of computer science and electrical engineering at Stanford. "There is a tension between individuals' privacy rights and the need for, say, law enforcement to process sensitive information." For example, a law enforcement agent might want to search several airline databases to find individuals who satisfy certain criteria. "How do we search these databases while preserving privacy of people who do not match the criteria?" Boneh asks. Similar questions apply to health and financial databases.
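To make Boneh's question concrete, here is a minimal sketch of one textbook approach, private set intersection by commutative blinding. It is not the project's method. The names `hash_to_group`, `blind` and `private_matches`, the modulus `P` and the sample names are all assumptions made for illustration. Each party raises hashed names to its own secret exponent; because exponentiation commutes, doubly blinded values agree only for names both sides hold, so the agent learns which watchlist entries match without seeing the rest of the passenger list. Both roles run in one function here purely for brevity, and the toy parameters are not secure.

```python
import hashlib
import secrets

# Assumption: a toy modulus for illustration only (2**127 - 1 is a Mersenne prime).
# A real system would use a vetted cryptographic group and a reviewed protocol.
P = 2**127 - 1


def hash_to_group(name: str) -> int:
    """Map a name to a nonzero integer modulo P."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return (int.from_bytes(digest, "big") % P) or 1


def blind(values, key):
    """Raise each value to a secret exponent modulo P."""
    return [pow(v, key, P) for v in values]


def private_matches(watchlist, passengers):
    """Return the watchlist names that also appear in the passenger list."""
    agent_key = secrets.randbelow(P - 3) + 2    # known only to the agent
    airline_key = secrets.randbelow(P - 3) + 2  # known only to the airline

    # Agent -> airline: watchlist blinded once with the agent's key.
    agent_once = blind([hash_to_group(n) for n in watchlist], agent_key)

    # Airline -> agent: the agent's values re-blinded in the same order,
    # plus the airline's own list blinded once with the airline's key.
    agent_twice = blind(agent_once, airline_key)
    airline_once = blind([hash_to_group(n) for n in passengers], airline_key)

    # Agent: finish blinding the airline's values and compare.
    airline_twice = set(blind(airline_once, agent_key))
    return [n for n, tag in zip(watchlist, agent_twice) if tag in airline_twice]


if __name__ == "__main__":
    print(private_matches(["alice", "mallory"], ["bob", "alice", "carol"]))
```

In this sketch the agent never sees unblinded names for passengers who are not on the watchlist, which is the property Boneh describes; whether that holds in practice depends on choices the sketch leaves out, such as the group, the hashing and what each side is allowed to learn about list sizes.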
The new project is one of eight large endeavors funded this year by NSF's Information Technology Research program. Funded partners, who will receive $12.5 million over five years, are Stanford, Yale, the University of New Mexico, New York University and the Stevens Institute of Technology. Non-funded affiliates include the U.S. Secret Service, the U.S. Census Bureau, the Department of Health and Human Services, Microsoft, IBM, Hewlett-Packard, Citigroup, the Center for Democracy and Technology, and the Electronic Privacy Information Center.
Government and business both want more access to data, says Yale University computer science Professor Joan Feigenbaum, who holds a Stanford doctorate and is one of the investigators on this project. Individuals want the advantages that can result from data collection and analysis -- but not the disadvantages. "Use of transaction data and surveillance data need to be consistent with basic U.S. constitutional structure and with basic social and business norms," she says.
For the NSF project, technologists, lawyers, policy advocates and domain experts will team up to explore ways to meet potentially conflicting goals -- respecting individual rights and allowing organizations to collect and mine massive data sets. They will participate in biannual workshops and professional meetings, collaborate on publications and jointly advise student and postdoctoral researchers.
The researchers hope, for example, to develop tools to manage sensitive data in peer-to-peer (P2P) networks, where hundreds or even millions of users share data, music, images, movies and even academic papers without a centralized web server; instead, the shared files are distributed across the users' own computers. But those computers also may hold private files that users don't want to share.
In addition, the researchers will explore ways to enforce database policies. For example, privacy-preserving policies need to be better integrated into database management systems to ensure compliance with laws such as the Health Insurance Portability and Accountability Act (HIPAA), Feigenbaum says.
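As a rough illustration of what building a policy into the data layer could look like, the sketch below checks every read against a per-role list of permitted columns before any query runs. The `POLICY` table, the role names, the `patients` schema and the helper functions are all hypothetical; they are not drawn from HIPAA or from the project, and a real system would need far more (auditing, row-level rules, purpose limits).

```python
import sqlite3

# Hypothetical policy: the columns each role is allowed to read.
POLICY = {
    "billing": {"patient_id", "balance"},
    "physician": {"patient_id", "diagnosis"},
}


def read_columns(conn, role, columns):
    """Run a read only if every requested column is permitted for the role."""
    if not columns:
        raise ValueError("at least one column is required")
    allowed = POLICY.get(role, set())
    denied = [c for c in columns if c not in allowed]
    if denied:
        raise PermissionError(f"role {role!r} may not read {denied}")
    # Column names were validated against the allowlist above,
    # so interpolating them into the statement is safe here.
    sql = f"SELECT {', '.join(columns)} FROM patients"
    return conn.execute(sql).fetchall()


def demo_db():
    """Build a small in-memory database with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patients (patient_id INTEGER, diagnosis TEXT, balance REAL)"
    )
    conn.execute("INSERT INTO patients VALUES (1, 'asthma', 120.0)")
    return conn


if __name__ == "__main__":
    db = demo_db()
    print(read_columns(db, "billing", ["patient_id", "balance"]))
```

Putting the check in the one function that issues queries means an application cannot skip it by accident, which is the sense in which a policy is integrated into the system rather than left to each caller.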
The participants also hope to create a new generation of technology that can thwart what Boneh calls "the fastest growing crime in the U.S. and in the world" -- identity theft. A substantial amount of it happens online, at spoofed websites that pose as legitimate ones to entice you to enter sensitive information, such as a credit card or Social Security number. The spoofer can then use that information to apply for credit cards in your name or otherwise usurp your digital persona.
Stanford Report, October 1, 2003