CONTACT: Stanford University News Service (650) 723-2558
Creation of a 'virtual library' goal of new research project
STANFORD -- In the days when most information was stored on paper, getting access to it was often a major problem. Now, with more and more critical information stored electronically and increasingly accessible over computer networks, the biggest difficulty is rapidly becoming that of finding the information you are looking for among billions and billions of bits of digital data.
Addressing this issue is the main impetus for the Stanford Integrated Digital Library project, which last week received $3.6 million from the federal government.
It is one of six digital library projects funded as part of a joint $24.4 million initiative of the National Science Foundation, the Department of Defense Advanced Research Projects Agency and the National Aeronautics and Space Administration.
"We see these projects as taking the next step - and a very large one - in our ability to make available vast stores of knowledge and innovative information services . . . to researchers, students, educators and the general public," said Paul Young, assistant director of NSF's Directorate for Computer and Information Science and Engineering, in a news release.
A number of the other projects involve setting up prototype digital libraries. But the goal of the four-year Stanford effort is to create a "virtual library" that will give Internet users a seamless interface to the wide variety of information sources and collections becoming available on the network.
"Today, digital libraries come in a number of different architectures and file structures. This variety makes it very difficult for people to find the information they are looking for. So we intend to develop a common environment that links everything from personal information to library collections, to large research databases," says Stanford computer science Professor Hector Garcia-Molina, who is the lead scientist on the project.
Co-principal scientists are Terry Winograd, professor of computer science, and Yoav Shoham, associate professor of computer science. Rebecca Lasher and Victoria Reich from Stanford University Libraries are involved in the project. Also participating are researchers from Dialog Information Services, Hewlett-Packard, NASA/Ames Research Center, the Association for Computing Machinery, Interconnect Technologies Corp., Enterprise Integration Technologies, Bell Communications Research, Interval Research Corp., O'Reilly and Associates, WAIS Inc., and Xerox Palo Alto Research Center.
The basis of this environment will be something that the researchers call an "information bus." It will consist of basic concepts, language and protocols that can tie together the materials, services and users of information. Special programs, called protocol machines, will be developed for specific digital libraries. These will translate between the library and the information bus. At the other end, special client interfaces will be developed that connect the user to the information bus.
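The arrangement described above can be illustrated with a minimal sketch. It is not the project's design: every class name, method and record format below is hypothetical and chosen only to show the idea of one protocol machine per library translating between that library's native interface and a common bus, with a client talking only to the bus.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Record:
    """Common result format carried on the bus (an assumption)."""
    source: str
    title: str


class CatalogLibrary:
    """Hypothetical library that stores titles in a plain list."""

    def __init__(self, titles: List[str]):
        self.titles = titles

    def find_titles(self, word: str) -> List[str]:
        return [t for t in self.titles if word.lower() in t.lower()]


class KeyedDatabase:
    """Hypothetical research database keyed by record id."""

    def __init__(self, rows: Dict[int, Dict[str, str]]):
        self.rows = rows

    def scan(self, word: str) -> List[Dict[str, str]]:
        return [r for r in self.rows.values() if word.lower() in r["name"].lower()]


class CatalogMachine:
    """Protocol machine: translates bus queries for a CatalogLibrary."""

    def __init__(self, name: str, library: CatalogLibrary):
        self.name, self.library = name, library

    def search(self, query: str) -> List[Record]:
        return [Record(self.name, t) for t in self.library.find_titles(query)]


class DatabaseMachine:
    """Protocol machine: translates bus queries for a KeyedDatabase."""

    def __init__(self, name: str, db: KeyedDatabase):
        self.name, self.db = name, db

    def search(self, query: str) -> List[Record]:
        return [Record(self.name, r["name"]) for r in self.db.scan(query)]


class InformationBus:
    """Forwards one query to every attached protocol machine."""

    def __init__(self):
        self.machines = []

    def attach(self, machine) -> None:
        self.machines.append(machine)

    def search(self, query: str) -> List[Record]:
        results: List[Record] = []
        for machine in self.machines:
            results.extend(machine.search(query))
        return results
```

In this sketch a client interface would call only `InformationBus.search`, so adding a new kind of library means writing one more protocol machine rather than changing the client.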
"There must be better ways to collect, organize and share data in an environment consisting of millions of pieces of information, many of which are changing constantly," says Winograd, who will be working on client interfaces.
One of the interfaces they will be experimenting with is an information map. This will be like a street map, except that it will map information structures, and it will allow users to move about by pointing and clicking at different parts of the map. Another approach is to use animation: for example, moving down a data highway and passing road signs that tell users what information is located in each block.
In addition to the technical problems it will address, the project will tackle such critical issues as the cost of information, the protection of intellectual property rights, and the privacy and security of personal information.
"We will be attempting to build these into the basic protocols in the information bus," says Shoham.
Because society has not yet agreed on how to apply these basic principles to the digital realm, the researchers will attack these problems analytically and experimentally. They will develop possible solutions, build them into demos and see how well they work.
In the area of intellectual property rights, for example, the researchers will develop a "copy detection service," a registry to which people can send their documents. Once documents are registered, the service can compare them with questionable documents to determine the extent of duplication. "In some preliminary tests we find that we get about a 5 percent random match," says Garcia-Molina.
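The release does not say how the service compares documents. As a rough illustration only, the sketch below assumes one common approach: break each text into overlapping three-word chunks and report what fraction of a questionable document's chunks also appear in each registered document. The names `CopyRegistry`, `register` and `overlap`, and the chunk size, are hypothetical.

```python
import re
from typing import Dict, Set, Tuple

Chunk = Tuple[str, ...]


def chunks(text: str, k: int = 3) -> Set[Chunk]:
    """Return the set of overlapping k-word chunks in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


class CopyRegistry:
    """Hypothetical registry of documents, stored as chunk sets."""

    def __init__(self, k: int = 3):
        self.k = k
        self.docs: Dict[str, Set[Chunk]] = {}

    def register(self, doc_id: str, text: str) -> None:
        self.docs[doc_id] = chunks(text, self.k)

    def overlap(self, text: str) -> Dict[str, float]:
        """Fraction of the questioned text's chunks found in each registered document."""
        probe = chunks(text, self.k)
        if not probe:
            # Text shorter than k words has no chunks to compare.
            return {doc_id: 0.0 for doc_id in self.docs}
        return {
            doc_id: len(probe & registered) / len(probe)
            for doc_id, registered in self.docs.items()
        }
```

Short chunks can also match by chance between unrelated documents, so under this kind of scheme a small nonzero score is not by itself evidence of copying.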
Another type of service the scientists will be developing is one that allows users to create their own automated agents for library services.
"Agents are a general faculty for providing help. They can automate tasks, navigate to specific locations, notify you when certain conditions occur and speak to other agents," Shoham explains.
For example, a user might tell an agent to find all the information on the network involving digital libraries. The agent would then search all the libraries on the network for files including the term, arrange payment for the information in cases where there are charges and return it to the user's computer.
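The steps in this example (search every library, pay where there is a charge, return the results) can be sketched as follows. This is an illustration under stated assumptions, not the project's agent design: the `SearchAgent` class, the representation of a library as a list of (title, price) pairs, and the spending budget are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Item:
    library: str
    title: str
    price: float  # 0.0 means the item is free


class SearchAgent:
    """Hypothetical agent that searches libraries and pays within a budget."""

    def __init__(self, budget: float):
        self.budget = budget  # spending limit set by the user (an assumption)
        self.spent = 0.0

    def _pay(self, price: float) -> bool:
        """Arrange payment if the remaining budget allows it."""
        if self.spent + price > self.budget:
            return False
        self.spent += price
        return True

    def run(self, term: str, libraries: Dict[str, List[Tuple[str, float]]]) -> List[Item]:
        """Search every library for the term and return the items obtained."""
        delivered: List[Item] = []
        for name, holdings in libraries.items():
            for title, price in holdings:
                if term.lower() not in title.lower():
                    continue  # not a match
                if price > 0 and not self._pay(price):
                    continue  # charge exceeds the remaining budget
                delivered.append(Item(name, title, price))
        return delivered
```

A user would create the agent once, for instance `SearchAgent(budget=5.0)`, and call `run("digital libraries", libraries)` to collect the matching items.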