"Currently, NSF is funding 64 projects, each making its own contribution to the library, with a total annual budget of about $24 million. Many projects are building collections; others are developing services; a few are carrying out targeted research.The NSDL is a broad program to build a digital library for education in science, mathematics, engineering and technology. It is funded by the National Science Foundation (NSF) Division of Undergraduate Education. . . . The Core Integration task is to ensure that the NSDL is a single coherent library, not simply a set of unrelated activities. In summer 2000, the NSF funded six Core Integration demonstration projects, each lasting a year. One of these grants was to Cornell University and our demonstration is known as Site for Science. It is at http://www.siteforscience.org/ [Site for Science]. In late 2001, the NSF consolidated the Core Integration funding into a single grant for the production release of the NSDL. This grant was made to a collaboration of the University Corporation for Atmospheric Research (UCAR), Columbia University and Cornell University. The technical approach being followed is based heavily on our experience with Site for Science. Therefore this article is both a description of the strategy for interoperability that was developed for Site for Science and an introduction to the architecture being used by the NSDL production team."
ISBN
1082-9873
Critical Arguements
CA "[T]his article is both a description of the strategy for interoperability that was developed for the [Cornell University's NSF-funded] Site for Science and an introduction to the architecture being used by the NSDL production team."
Phrases
<P1> The grand vision is that the NSDL become a comprehensive library of every digital resource that could conceivably be of value to any aspect of education in any branch of science and engineering, both defined very broadly. <P2> Interoperability among heterogeneous collections is a central theme of the Core Integration. The potential collections have a wide variety of data types, metadata standards, protocols, authentication schemes, and business models. <P3> The goal of interoperability is to build coherent services for users, from components that are technically different and managed by different organizations. This requires agreements to cooperate at three levels: technical, content and organizational. <P4> Much of the research of the authors of this paper aims at . . . looking for approaches to interoperability that have low cost of adoption, yet provide substantial functionality. One of these approaches is the metadata harvesting protocol of the Open Archives Initiative (OAI) . . . <P5> For Site for Science, we identified three levels of digital library interoperability: Federation; Harvesting; Gathering. In this list, the top level provides the strongest form of interoperability, but places the greatest burden on participants. The bottom level requires essentially no effort by the participants, but provides a poorer level of interoperability. The Site for Science demonstration concentrated on the harvesting and gathering, because other projects were exploring federation. <P6> In an ideal world all the collections and services that the NSDL wishes to encompass would support an agreed set of standard metadata. The real world is less simple. . . . However, the NSDL does have influence. We can attempt to persuade collections to move along the interoperability curve. <warrant> <P7> The Site for Science metadata strategy is based on two principles. The first is that metadata is too expensive for the Core Integration team to create much of it. Hence, the NSDL has to rely on existing metadata or metadata that can be generated automatically. The second is to make use of as much of the metadata available from collections as possible, knowing that it varies greatly from none to extensive. Based on these principles, Site for Science, and subsequently the entire NSDL, developed the following metadata strategy: Support eight standard formats; Collect all existing metadata in these formats; Provide crosswalks to Dublin Core; Assemble all metadata in a central metadata repository; Expose all metadata records in the repository for service providers to harvest; Concentrate limited human effort on collection-level metadata; Use automatic generation to augment item-level metadata. <P8> The strategy developed by Site for Science and now adopted by the NSDL is to accumulate metadata in the native formats provided by the collections . . . If a collection supports the protocols of the Open Archives Initiative, it must be able to supply unqualified Dublin Core (which is required by the OAI) as well as the native metadata format. <P9> From a computing viewpoint, the metadata repository is the key component of the Site for Science system. The repository can be thought of as a modern variant of the traditional library union catalog, a catalog that holds comprehensive catalog records from a group of libraries. . . . Metadata from all the collections is stored in the repository and made available to providers of NSDL service.
Conclusions
RQ 1 "Can a small team of librarians manage the collection development and metadata strategies for a very large library?" RQ 2 "Can the NSDL actually build services that are significantly more useful than the general web search services?"
Type
Electronic Journal
Title
The Dublin Core Metadata Inititiative: Mission, Current Activities, and Future Directions
Metadata is a keystone component for a broad spectrum of applications that are emerging on the Web to help stitch together content and services and make them more visible to users. The Dublin Core Metadata Initiative (DCMI) has led the development of structured metadata to support resource discovery. This international community has, over a period of 6 years and 8 workshops, brought forth: A core standard that enhances cross-disciplinary discovery and has been translated into 25 languages to date; A conceptual framework that supports the modular development of auxiliary metadata components; An open consensus building process that has brought to fruition Australian, European and North American standards with promise as a global standard for resource discovery; An open community of hundreds of practitioners and theorists who have found a common ground of principles, procedures, core semantics, and a framework to support interoperable metadata.
CA There is great potential in developing a national standard for the control of records that combines traditional recordkeeping practices with continuum-based thinking and cutting-edge metadata.
Conclusions
RQ One challenge is integrating item-level metadata with system-level metadata. Linking old and new archival descriptive systems should be done as seamlessly as possible, since retrofitting would be too expensive. Another important area is linking contextual metadata to records whenever they are used outside their domain in order to provide "external validation" (p.17) <warrant>