There are a large number of metadata standards and initiatives that have relevance to digital preservation, e.g. those designed to support the work of national and research libraries, archives and digitization initiatives. This paper introduces some of these, noting that the developers of some have acknowledged the importance of maintaining or re-using existing metadata. It is argued here that the implementation of metadata registries as part of a digital preservation system may assist repositories in enabling the management and re-use of this metadata and may also help interoperability, namely the exchange of metadata and information packages between repositories.
Publisher
2003 Dublin Core Conference: Supporting Communities of Discourse and Practice-Metadata Research & Applications
Publication Location
Seattle, WA
Critical Arguments
CA "This paper will introduce a range of preservation metadata initiatives including the influential Open Archival Information System (OAIS) reference model and a number of other initiatives originating from national and research libraries, digitization projects and the archives community. It will then comment on the need for interoperability between these specifications and propose that the implementation of metadata registries as part of a digital preservation system may help repositories manage diverse metadata and facilitate the exchange of metadata or information packages between repositories."
Conclusions
RQ "The plethora of metadata standards and formats that have been developed to support the management and preservation of digital objects leaves us with several questions about interoperability. For example, will repositories be able to cope with the wide range of standards and formats that exist? Will they be able to transfer metadata or information packages containing metadata to other repositories? Will they be able to make use of the 'recombinant potential' of existing metadata?" ... "A great deal of work needs to be done before this registry-based approach can be proved to be useful. While it would undoubtedly be useful to have registries of the main metadata standards developed to support preservation, it is less clear how mapping-based conversions between them would work in practice. Metadata specifications are based on a range of different models and conversions often lead to data loss. Also, much more consideration needs to be given to the practical issues of implementation." 
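The concern that mapping-based conversions lose data can be illustrated with a small sketch. The registry layout, the scheme names and the element names below are all hypothetical and are not taken from the paper or from any actual specification. The sketch only shows that a crosswalk lookup can report which source elements have no equivalent in the target scheme.

```python
from typing import Dict, List, Tuple

# Hypothetical registry: (source scheme, target scheme) -> element mapping.
# Scheme and element names are invented for illustration only.
REGISTRY: Dict[Tuple[str, str], Dict[str, str]] = {
    ("scheme_a", "scheme_b"): {
        "title": "name",
        "creator": "author",
        "fixity_digest": "checksum",
    },
}


def convert(record: Dict[str, str], source: str, target: str) -> Tuple[Dict[str, str], List[str]]:
    """Map a record between schemes and list the elements that could not be mapped."""
    mapping = REGISTRY[(source, target)]
    converted = {mapping[key]: value for key, value in record.items() if key in mapping}
    lost = sorted(key for key in record if key not in mapping)
    return converted, lost


record = {
    "title": "Annual report",
    "creator": "Records Office",
    "rendering_environment": "word processor, version unknown",
}
converted, lost = convert(record, "scheme_a", "scheme_b")
```

In this example `lost` names the one element that the assumed target scheme cannot hold, which is the kind of loss a repository would want to record before exchanging an information package.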
SOW
DC Michael Day is a research officer at UKOLN, which is based at the University of Bath. He belongs to UKOLN's research and development team, and works primarily on projects concerning metadata, interoperability and digital preservation. 
Type
Conference Proceedings
Title
Preserving the Fabric of Our Lives: A Survey of Web Preservation Initiatives
This paper argues that the growing importance of the World Wide Web means that Web sites are key candidates for digital preservation. After an [sic] brief outline of some of the main reasons why the preservation of Web sites can be problematic, a review of selected Web archiving initiatives shows that most current initiatives are based on combinations of three main approaches: automatic harvesting, selection and deposit. The paper ends with a discussion of issues relating to collection and access policies, software, costs and preservation.
Secondary Title
Research and Advanced Technology for Digital Libraries, 7th European Conference, ECDL 2003, Trondheim, Norway, August 2003 Proceedings
Publisher
Springer
Publication Location
Berlin
Critical Arguments
CA "UKOLN undertook a survey of existing Web archiving initiatives as part of a feasibility study carried out for the Joint Information Systems Committee (JISC) of the UK further and higher education funding councils and the Library of the Wellcome Trust. After a brief description of some of the main problems with collecting and preserving the Web, this paper outlines the key findings of this survey." (p. 462) Addresses technical, legal and organizational challenges to archiving the World Wide Web. Surveys major attempts that have been undertaken to archive the Web, highlights the advantages and disadvantages of each, and discusses problems that remain to be addressed.
Conclusions
RQ "It is hoped that this short review of existing Web archiving initiatives has demonstrated that collecting and preserving Web sites is an interesting area of research and development that has now begun to move into a more practical implementation phase. To date, there have been three main approaches to collection, characterised in this report as 'automatic harvesting,' 'selection' and 'deposit.' Which one of these has been implemented has normally depended upon the exact purpose of the archive and the resources available. Naturally, there are some overlaps between these approaches but the current consensus is that a combination of them will enable their relative strengths to be utilised. The longer-term preservation issues of Web archiving have been explored in less detail." (p. 470)
SOW
DC OAIS emerged out of an initiative spearheaded by NASA's Consultative Committee for Space Data Systems. It has been shaped and promoted by the RLG and OCLC. Several international projects have played key roles in shaping the OAIS model and adapting it for use in libraries, archives and research repositories. OAIS-modeled repositories include the CEDARS Project, Harvard's Digital Repository, Koninklijke Bibliotheek (KB), the Library of Congress' Archival Information Package for audiovisual materials, MIT's D-Space, OCLC's Digital Archive and TERM: the Texas Email Repository Model.
This study focuses upon access to authentic electronic records that are no longer required in day-to-day operations and that have been set aside in a recordkeeping system or storage repository for future reference. One school of thought, generally associated with computer information technology specialists, holds that long-term access to electronic records is primarily a technological issue with little attention devoted to authenticity. Another school of thought, associated generally with librarians, archivists, and records managers, contends that long-term access to electronic records is as much an intellectual issue as it is a technological issue. This latter position is clearly evident in several recent research projects and studies about electronic records whose findings illuminate the discussion of long-term access to electronic records. Therefore, a review of eight research projects highlighting findings relevant for long-term access to electronic records begins this chapter. This review is followed by a discussion, from the perspective of archival science, of nine questions that a long-term access strategy must take into account. The nine issues are: What is a document?; What is a record?; What are authentic electronic records?; What does "archiving" mean?; What is an authentic reformatted electronic record?; What is a copy of an authentic electronic record?; What is an authentic converted electronic record?; What is involved in the migration of authentic electronic records?; What is technology obsolescence?
Book Title
Authentic Electronic Records: Strategies for Long-Term Access
Publisher
Cohasset Associates, Inc.
Publication Location
Chicago
ISBN
0970064004
Critical Arguments
CA "Building upon the key concepts and concerns articulated by the studies described above, this report attempts to move the discussion of long-term access to electronic records toward more clearly identified, generally applicable and readily im(TRUNCATED)
Conclusions
RQ
SOW
DC This book chapter was written by Charles M. Dollar for Cohasset Associates, Inc. Mr. Dollar has "twenty-five years of experience in working with electronic records as a manager at the National Archives and Records Administration, as an archival educator at the University of British Columbia, and a consultant to governments and businesses in North America, Asia, Europe, and the Middle East." Cohasset Associates Inc. is "one of the nation's foremost consulting firms specializing in document-based information management."
Type
Journal
Title
Six degrees of separation: Australian metadata initiatives and their relationships with international standards
CA The record used to be annotated by hand, but with the advent of electronic business the record has now become unreliable and increasingly vulnerable to loss or corruption. Metadata is part of a recordkeeping regime instituted by the NAA to address this problem.
Phrases
<P1> Electronic metadata makes the digital world go round. The digital world also works better when there are standards. Standards encourage best practice. They help the end user by encouraging the adoption of common platforms and interfaces in different systems environments. (p. 275) <P2> In relation to Web-based publishing and online service delivery, the Strategy, which has Cabinet-level endorsement, requires all government agencies to comply with metadata and recordkeeping standards issued by the NAA. (p.276) <warrant>
Conclusions
RQ How do you effectively work with software vendors and government in order to encourage metadata schema adoption and use?
SOW
DC OAIS emerged out of an initiative spearheaded by NASA's Consultative Committee for Space Data Systems. It has been shaped and promoted by the RLG and OCLC. Several international projects have played key roles in shaping the OAIS model and adapting it for use in libraries, archives and research repositories. OAIS-modeled repositories include the CEDARS Project, Harvard's Digital Repository, Koninklijke Bibliotheek (KB), the Library of Congress' Archival Information Package for audiovisual materials, MIT's D-Space, OCLC's Digital Archive and TERM: the Texas Email Repository Model.
Type
Journal
Title
Digital preservation: Where we are, where we're going, where we need to be
CA Digital preservation will begin to come into its own. The past five years were about building access; now standards are coalescing and more focus is being paid to actual preservation strategies. Major legal obstacles include the DMCA, which restricts what institutions can do to preserve digital information. There are economic challenges, and we do not really know how much digital preservation will cost.
Phrases
<P1> There will be change, there is no guarantee that you can pick a technology and stay with it for ten years. We have to have an awareness of technological change and what's coming -- we listen to peers and the larger institutions that are taking leading and bleeding edge roles, and we make wise decisions. So in this case it is OK to be trailing edge and choose something that is well-established. (p.3)
SOW
DC OAIS emerged out of an initiative spearheaded by NASA's Consultative Committee for Space Data Systems. It has been shaped and promoted by the RLG and OCLC. Several international projects have played key roles in shaping the OAIS model and adapting it for use in libraries, archives and research repositories. OAIS-modeled repositories include the CEDARS Project, Harvard's Digital Repository, Koninklijke Bibliotheek (KB), the Library of Congress' Archival Information Package for audiovisual materials, MIT's D-Space, OCLC's Digital Archive and TERM: the Texas Email Repository Model.
Type
Electronic Journal
Title
A Spectrum of Interoperability: The Site for Science Prototype for the NSDL
"Currently, NSF is funding 64 projects, each making its own contribution to the library, with a total annual budget of about $24 million. Many projects are building collections; others are developing services; a few are carrying out targeted research. The NSDL is a broad program to build a digital library for education in science, mathematics, engineering and technology. It is funded by the National Science Foundation (NSF) Division of Undergraduate Education. . . . The Core Integration task is to ensure that the NSDL is a single coherent library, not simply a set of unrelated activities. In summer 2000, the NSF funded six Core Integration demonstration projects, each lasting a year. One of these grants was to Cornell University and our demonstration is known as Site for Science. It is at http://www.siteforscience.org/ [Site for Science]. In late 2001, the NSF consolidated the Core Integration funding into a single grant for the production release of the NSDL. This grant was made to a collaboration of the University Corporation for Atmospheric Research (UCAR), Columbia University and Cornell University. The technical approach being followed is based heavily on our experience with Site for Science. Therefore this article is both a description of the strategy for interoperability that was developed for Site for Science and an introduction to the architecture being used by the NSDL production team."
ISBN
1082-9873
Critical Arguments
CA "[T]his article is both a description of the strategy for interoperability that was developed for the [Cornell University's NSF-funded] Site for Science and an introduction to the architecture being used by the NSDL production team."
Phrases
<P1> The grand vision is that the NSDL become a comprehensive library of every digital resource that could conceivably be of value to any aspect of education in any branch of science and engineering, both defined very broadly. <P2> Interoperability among heterogeneous collections is a central theme of the Core Integration. The potential collections have a wide variety of data types, metadata standards, protocols, authentication schemes, and business models. <P3> The goal of interoperability is to build coherent services for users, from components that are technically different and managed by different organizations. This requires agreements to cooperate at three levels: technical, content and organizational. <P4> Much of the research of the authors of this paper aims at . . . looking for approaches to interoperability that have low cost of adoption, yet provide substantial functionality. One of these approaches is the metadata harvesting protocol of the Open Archives Initiative (OAI) . . . <P5> For Site for Science, we identified three levels of digital library interoperability: Federation; Harvesting; Gathering. In this list, the top level provides the strongest form of interoperability, but places the greatest burden on participants. The bottom level requires essentially no effort by the participants, but provides a poorer level of interoperability. The Site for Science demonstration concentrated on the harvesting and gathering, because other projects were exploring federation. <P6> In an ideal world all the collections and services that the NSDL wishes to encompass would support an agreed set of standard metadata. The real world is less simple. . . . However, the NSDL does have influence. We can attempt to persuade collections to move along the interoperability curve. <warrant> <P7> The Site for Science metadata strategy is based on two principles. The first is that metadata is too expensive for the Core Integration team to create much of it. Hence, the NSDL has to rely on existing metadata or metadata that can be generated automatically. The second is to make use of as much of the metadata available from collections as possible, knowing that it varies greatly from none to extensive. Based on these principles, Site for Science, and subsequently the entire NSDL, developed the following metadata strategy: Support eight standard formats; Collect all existing metadata in these formats; Provide crosswalks to Dublin Core; Assemble all metadata in a central metadata repository; Expose all metadata records in the repository for service providers to harvest; Concentrate limited human effort on collection-level metadata; Use automatic generation to augment item-level metadata. <P8> The strategy developed by Site for Science and now adopted by the NSDL is to accumulate metadata in the native formats provided by the collections . . . If a collection supports the protocols of the Open Archives Initiative, it must be able to supply unqualified Dublin Core (which is required by the OAI) as well as the native metadata format. <P9> From a computing viewpoint, the metadata repository is the key component of the Site for Science system. The repository can be thought of as a modern variant of the traditional library union catalog, a catalog that holds comprehensive catalog records from a group of libraries. . . . Metadata from all the collections is stored in the repository and made available to providers of NSDL service.
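The strategy quoted in P7 to P9 (keep metadata in its native format, provide crosswalks to Dublin Core, and expose the records from a central repository) can be sketched roughly as follows. This is an illustrative sketch only, not the NSDL implementation: the class, the format name `native_x` and its field names are assumptions introduced here.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical crosswalks from a stored format to unqualified Dublin Core.
# "native_x" and its field names ("heading", "maker") are invented for illustration.
CROSSWALKS: Dict[str, Callable[[dict], dict]] = {
    "oai_dc": lambda rec: dict(rec),  # already Dublin Core, copied unchanged
    "native_x": lambda rec: {
        "title": rec.get("heading", ""),
        "creator": rec.get("maker", ""),
    },
}


class MetadataRepository:
    """Stores each record in its native format and derives Dublin Core on demand."""

    def __init__(self) -> None:
        self._records: Dict[str, List[Tuple[str, dict]]] = defaultdict(list)

    def ingest(self, collection: str, fmt: str, record: dict) -> None:
        """Accept a record only if a crosswalk exists for its format."""
        if fmt not in CROSSWALKS:
            raise ValueError(f"unsupported format: {fmt}")
        self._records[collection].append((fmt, record))

    def expose_dublin_core(self) -> List[dict]:
        """Return every stored record crosswalked to Dublin Core."""
        return [
            CROSSWALKS[fmt](rec)
            for items in self._records.values()
            for fmt, rec in items
        ]


repo = MetadataRepository()
repo.ingest("collection_a", "native_x", {"heading": "Plate tectonics", "maker": "A. Author"})
repo.ingest("collection_b", "oai_dc", {"title": "Cell biology", "creator": "B. Author"})
dc_records = repo.expose_dublin_core()
```

Keeping the native record and deriving Dublin Core at exposure time means a richer crosswalk can be applied later without re-collecting the metadata, which matches the union-catalog analogy in P9.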
Conclusions
RQ 1 "Can a small team of librarians manage the collection development and metadata strategies for a very large library?" RQ 2 "Can the NSDL actually build services that are significantly more useful than the general web search services?"
CA Through OAI, access to resources is effected in a low-cost, interoperable manner.
Phrases
<P1> The need for a metadata format that would support both metadata creation by authors and interoperability across heterogeneous repositories led to the choice of unqualified Dublin Core. (p.16) <P2> OAI develops and promotes a low-barrier interoperability framework and associated standards, originally to enhance access to e-print archives, but now taking into account access to other digital materials. (p.16)
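As a rough illustration of the low-barrier framework described above, an OAI metadata harvesting request is an ordinary HTTP query. The sketch below builds a `ListRecords` request for unqualified Dublin Core (`oai_dc`); the base URL is a placeholder and the helper function is an assumption made for this example, not part of any OAI toolkit.

```python
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc", set_spec: str = "") -> str:
    """Build a harvesting request URL; `set_spec` optionally limits the harvest to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint: replace with a real repository's base URL.
url = list_records_url("https://repository.example.org/oai")
subset_url = list_records_url("https://repository.example.org/oai", set_spec="physics")
```

A harvester would fetch this URL and parse the XML response; that step is omitted here because it depends on the repository being harvested.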
Conclusions
RQ The many players involved in cultural heritage need to work together to define standards and best practices.
CA Metadata is a key part of the information infrastructure necessary to organize and classify the massive amount of information on the Web. Metadata, just like the resources they describe, will range in quality and be organized around different principles. Modularity is critical to allow metadata schema designers to base their new creations on established schemas, thereby benefiting from best practices rather than reinventing elements each time. Extensibility and cost-effectiveness are also important factors. Controlled vocabularies provide greater precision and access. Multilingualism (translating specification documents into many languages) is an important step in fostering global metadata architecture(s).
Phrases
<P1> The use of controlled vocabularies is another important approach to refinement that improves the precision for descriptions and leverages the substantial intellectual investment made by many domains to improve subject access. (p.4) <P2> Standards typically deal with these issues through the complementary processes of internationalization and localization: the former process relates to the creation of "neutral" standards, whereas the latter refers to the adaptation of such a neutral standard to a local context. (p.4)
Conclusions
RQ In order for the full potential of resource discovery that the Web could offer to be realized, a "convergence" of standards and semantics must occur.
Type
Electronic Journal
Title
Review: Some Comments on Preservation Metadata and the OAIS Model
CA Criticizes some of the limitations of OAIS and makes suggestions for improvements and clarifications. Also suggests that OAIS may be too library-centric, to the detriment of archival and especially recordkeeping needs. "In this article I have tried to articulate some of the main requirements for the records and archival community in preserving (archival) records. Based on this, the conclusion has to be that some adaptations to the [OAIS] model and metadata set would be necessary to meet these requirements. This concerns requirements such as the concept of authenticity of records, information on the business context of records and on relationships between records ('documentary context')." (p. 20)
Phrases
<P1> It requires records managers and archivists (and perhaps other information professionals) to be aware of these differences [in terminology] and to make a translation of such terms to their own domain. (p. 15) <P2> When applying the metadata model for a wider audience, more awareness of the issue of terminology is required, for instance by including clear definitions of key terms. (p. 15) <P3> The extent to which the management of objects can be influenced differs with respect to the type of objects. In the case of (government) records, legislation governs their creation and management, whereas, in the case of publications, the influence will be mostly based on agreements between producers, publishers and preservers. (p. 16) <P4> [A]lthough the suggestion may sometimes be otherwise, preservation metadata do not only apply to what is under the custody of a cultural or other preserving institution, but should be applied to the whole lifecycle of digital objects. ... Preservation can be viewed as part of maintenance. <warrant> (p. 16) <P5> [B]y taking library community needs as leading (albeit implicitly), the approach is already restricting the types of digital objects. Managing different types of 'digital objects', e.g. publications and records, may require not entirely similar sets of metadata. (p. 16) <P6> Another issue is that of the requirements governing the preservation processes. ... There needs to be insight and, as a consequence, also metadata about the preservation strategies, policies and methods, together with the context in which the preservation takes place. <warrant> (p. 16) <P7> [W]hat do we want to preserve? Is it the intellectual content with the functionality it has to have in order to make sense and achieve its purpose, or is it the digital components that are necessary to reproduce it or both? (p. 16-17) <P8> My view is that 'digital objects' should be seen as objects having both conceptual and technical aspects that are closely interrelated. As a consequence of the explanation given above, a digital object may consist of more than one 'digital component'. The definition given in the OAIS model is therefore insufficient. (p. 17) <P9> [W]e have no fewer than five metadata elements that could contain information on what should be rendered and presented on the screen. How all these elements relate to each other, if at all, is unclear. (p. 17) <P10> What we want to achieve ... is that in the future we will still be able to see, read and understand the documents or other information entities that were once produced for a certain purpose and in a certain context. In trying to achieve this, we of course need to preserve these digital components, but, as information technology will evolve, these components have to be migrated or in some cases emulated to be usable on future hard- and software platforms. (p. 17) <P11> I would like to suggest including an element that reflects the original technical environment. (p. 18) <P12> Records, according to the recently published ISO records management standard 15489, are 'information created, received and maintained as evidence and information by an organisation or person, in pursuance of legal obligations or in the transaction of business'. ... The main requirements for records to serve as evidence or authoritative information sources are ... authenticity and integrity, and knowledge about the business context and about the interrelationship between records (e.g. in a case file). <warrant> (p. 18) <P13> It would have been helpful if there had been more acknowledgement of the issue of authenticity and the requirements for it, and if the Working Group had provided some background information about its view and considerations on this aspect and to what extent it is included or not. (p. 19) <P14> In order to be able to preserve (archival) records it will ... be necessary to extend the information model with another class of information that refers to business context. Such a subset could provide a structure for describing what in archival terminology is called information about 'provenance' (with a different meaning from that in OAIS). (p. 19) <P15> In order to accommodate the identified complexity it is necessary to distinguish at least between the following categories of relationships: relationships between intellectual objects ... in the archival context this is referred to as 'documentary context'; relationships between the (structural) components of one intellectual object ... ; [and] relationships between digital components. (p. 19-20) <P16> [T]he issue of appraisal and disposition of records has to be included. In this context the recently published records management standard (ISO 15489) may serve as a useful framework. It would make the OAIS model even more widely applicable. (p. 20)
Conclusions
RQ "There are some issues ... which need further attention. They concern on the one hand the scope and underlying concepts of the OAIS model and the resulting metadata set as presented, and on the other hand the application of the model and metadata set in a records and archival environment. ... [T]he distinction between physical and conceptual or intellectual aspects of a digital object should be made more explicit and will probably have an impact on the model and metadata set also. More attention also needs to be given to the relationship between the (preservation) processes and the metadata. ... In assessing the needs of the records and archival community, the ISO records management standard 15489 may serve as a very useful framework. Such an exercise would also include a test for applicability of the model and metadata set for record-creating organisations and, as such, broaden the view of the OAIS model." (p. 20)
SOW
DC OAIS emerged out of an initiative spearheaded by NASA's Consultative Committee for Space Data Systems. It has been shaped and promoted by the RLG and OCLC. Several international projects have played key roles in shaping the OAIS model and adapting it for use in libraries, archives and research repositories. OAIS-modeled repositories include the CEDARS Project, Harvard's Digital Repository, Koninklijke Bibliotheek (KB), the Library of Congress' Archival Information Package for audiovisual materials, MIT's D-Space, OCLC's Digital Archive and TERM: the Texas Email Repository Model.
CA Describes efforts undertaken at the National Library of New Zealand to ensure preservation of electronic resources.
Phrases
<P1> The National Library Act 1965 provides the legislative framework for the National Library of New Zealand '... to collect, preserve, and make available recorded knowledge, particularly that relating to New Zealand, to supplement and further the work of other libraries in New Zealand, and to enrich the cultural and economic life of New Zealand and its cultural interchanges with other nations.' Legislation currently before Parliament, if enacted, will give the National Library the mandate to collect digital resources for preservation purposes. <warrant> (p. 18) <P2> So, the Library has an organisational commitment and may soon have the legislative environment to support the collection, management and preservation of digital objects. ... The next issue is what needs to be done to ensure that a viable preservation programme can actually be put in place. (p. 18) <P3> As the Library had already begun systematising its approach to resource discovery metadata, development of a preservation metadata schema for use within the Library was a logical next step. (p. 18) <P4> Work on the schema was initially informed by other international endeavours relating to preservation metadata, particularly that undertaken by the National Library of Australia. Initiatives through the CEDARS programme, OCLC/RLG activities and the emerging consensus regarding the role of the OAIS Reference Model ... were also taken into account. <warrant> (p. 18-19) <P5> The Library's Preservation Metadata schema is designed to strike a balance between the principles of preservation metadata, as expressed through the OAIS Information Model, and the practicalities of implementing a working set of preservation metadata. The same incentive informs a recent OCLC/RLG report on the OAIS model. (p. 19) <P6> [I]t is unlikely that anything resembling a comprehensive schema will become available in the short term. However, the need is pressing. (p. 19) <P7> The development of the preservation metadata schema is one component of an ongoing programme of activities needed to ensure the incorporation of digital material into the Library's core business processes with a view to the long-term accessibility of those resources. <warrant> (p. 19) <P8> The aim of the above activities is for the Library to be acknowledged as a 'trusted repository' for digital material which ensures the viability and authenticity of digital objects over time. (p. 20) <P9> The Library will also have to develop relationships with other organisations that might wish to achieve 'trusted repository' status in a country with a small population base and few agencies of appropriate size, funding and willingness to take on the role.
Conclusions
RQ There are still a number of important issues to be resolved before the Library's preservation programme can be deemed a success, including the need for: higher level of awareness of the need for digital preservation within the community of 'memory institutions' and more widely; metrics regarding the size and scope of the problem; finance to research and implement digital preservation; new skill sets for implementing digital preservation, e.g. running the multiplicity of hardware/software involved, digital conservation/archaeology; agreed international approaches to digital preservation; practical models to match the high level conceptual work already undertaken internationally; co-operation/collaboration between the wider range of agents potentially able to assist in developing digital preservation solutions, e.g. the computing industry; and, last but not least, clarity around intellectual property, copyright, privacy and moral rights.
SOW
DC OAIS emerged out of an initiative spearheaded by NASA's Consultative Committee for Space Data Systems. It has been shaped and promoted by the RLG and OCLC. Several international projects have played key roles in shaping the OAIS model and adapting it for use in libraries, archives and research repositories. OAIS-modeled repositories include the CEDARS Project, Harvard's Digital Repository, Koninklijke Bibliotheek (KB), the Library of Congress' Archival Information Package for audiovisual materials, MIT's D-Space, OCLC's Digital Archive and TERM: the Texas Email Repository Model.
Type
Electronic Journal
Title
Buckets: A new digital technology for preserving NASA research
CA Buckets are information objects designed to reduce dependency on traditional archives and database systems, thereby making them more resilient to the transient nature of information systems.
Phrases
Another focus of aggregation was including the metadata with data. Through experiences NASA researchers found that metadata tended to "drift" over time, becoming decoupled from the data it described or locked in specific DL systems and hard to extract or share with other systems. (p. 377) Buckets are designed to imbue the information objects with certain responsibilities, such as display, dissemination, protection, and maintenance of its contents. As such, buckets should be able to work with many DL systems simultaneously, and minimize or eliminate the necessary modification of DL systems to work with buckets. Ideally, buckets should work with everything and break nothing. This philosophy is formalized in the SODA DL model. The objects become "smarter" at the expense of the archives (that become "dumber"), as functionalities generally associated with archives are moved into the data objects themselves. (p. 390)
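The idea of an object that carries its own metadata and takes on display and maintenance duties can be sketched in a few lines. This is only an illustration of the concept under assumed names; the class, its methods and the sample contents are invented here and do not reproduce the actual bucket implementation or the SODA model.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Bucket:
    """Illustrative self-describing object: data and metadata travel together."""

    metadata: Dict[str, str]
    elements: Dict[str, bytes] = field(default_factory=dict)

    def add_element(self, name: str, content: bytes) -> None:
        """Store a named piece of content inside the object itself."""
        self.elements[name] = content

    def display(self) -> str:
        """The object renders its own summary rather than relying on an archive."""
        names = ", ".join(sorted(self.elements))
        return f"{self.metadata.get('title', 'untitled')}: {names}"

    def checksums(self) -> Dict[str, str]:
        """Maintenance duty: report a SHA-256 digest for each stored element."""
        return {name: hashlib.sha256(data).hexdigest() for name, data in self.elements.items()}


bucket = Bucket(metadata={"title": "Wind tunnel report"})
bucket.add_element("report.pdf", b"%PDF-")
bucket.add_element("data.csv", b"x,y\n1,2\n")
summary = bucket.display()
```

Because the metadata is a field of the object rather than a row in an external system, it cannot "drift" away from the data when the object is copied to another digital library.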
Conclusions
RQ The creation of high-quality tools for bucket creation and administration is absolutely necessary. The extension of authentication and security measures is key to supporting more technologies. Many applications of this sort of information object independence remain to be explored.
Type
Electronic Journal
Title
The Dublin Core Metadata Initiative: Mission, Current Activities, and Future Directions
Metadata is a keystone component for a broad spectrum of applications that are emerging on the Web to help stitch together content and services and make them more visible to users. The Dublin Core Metadata Initiative (DCMI) has led the development of structured metadata to support resource discovery. This international community has, over a period of 6 years and 8 workshops, brought forth: A core standard that enhances cross-disciplinary discovery and has been translated into 25 languages to date; A conceptual framework that supports the modular development of auxiliary metadata components; An open consensus building process that has brought to fruition Australian, European and North American standards with promise as a global standard for resource discovery; An open community of hundreds of practitioners and theorists who have found a common ground of principles, procedures, core semantics, and a framework to support interoperable metadata.
Type
Report
Title
D6.2 Impact on World-wide Metadata Standards Report
This document presents the ARTISTE three-level approach to providing an open and flexible solution for combined metadata and image content-based search and retrieval across multiple, distributed image collections. The intended audience for this report includes museum and gallery owners who are interested in providing or extending services for remote access, developers of collection management and image search and retrieval systems, and standards bodies in both the fine art and digital library domains.
Notes
ARTISTE (http://www.artisteweb.org/) is a European Commission supported project that has developed integrated content and metadata-based image retrieval across several major art galleries in Europe. Collaborating galleries include the Louvre in Paris, the Victoria and Albert Museum in London, the Uffizi Gallery in Florence and the National Gallery in London.
Edition
Version 2.0
Publisher
The ARTISTE Consortium
Publication Location
Southampton, United Kingdom
Accessed Date
08/24/05
Critical Arguements
CA Over the last two and a half years, ARTISTE has developed an image search and retrieval system that integrates distributed, heterogeneous image collections. This report positions the work achieved in ARTISTE with respect to metadata standards and approaches for open search and retrieval using digital library technology. In particular, this report describes three key aspects of ARTISTE: the transparent translation of local metadata to common standards such as Dublin Core and SIMI consortium attribute sets to allow cross-collection searching; a methodology for combining metadata and image content-based analysis into single search galleries to enable versatile retrieval and navigation facilities within and between gallery collections; and an open interface for cross-collection search and retrieval that advances existing open standards for remote access to digital libraries, such as OAI (Open Archives Initiative) and ZING SRW (Z39.50 International: Next Generation Search and Retrieval Web Service).
Conclusions
RQ "A large part of ARTISTE is concerned with use of existing standards for metadata frameworks. However, one area where existing standards have not been sufficient is multimedia content-based search and retrieval. A proposal has been made to ZING for additions to SRW. This will hopefully enable ARTISTE to make a valued contribution to this rapidly evolving standard." ... "The work started in ARTISTE is being continued in SCULPTEUR, another project funded by the European Commission. SCULPTEUR will develop both the technology and the expertise to create, manage, and present cultural archives of 3D models and associated multimedia objects." ... "We believe the full benefit of multimedia search and retrieval can only be realised through seamless integration of content-based analysis techniques. However, not only does introduction of content-based analysis require modification to existing standards as outlined in this report, but it also requires a review of the use of semantics in achieving digital library interoperability. In particular, machine understandable description of the semantics of textual metadata, multimedia content, and content-based analysis, can provide a foundation for a new generation of flexible and dynamic digital library tools and services." ... "Existing standards do not use explicit semantics to describe query operators or their application to metadata and multimedia content at individual sites. However, dynamically determining what operators and types are supported by a collection is essential to robust and efficient cross-collection searching. Dynamic use of published semantics would allow a collection and any associated content-based analysis to be changed by its owner without breaking conformance to search and retrieval standards. Furthermore, individual sites would not need to publish detailed, human readable descriptions of available functionality."
SOW
DC "Four major European galleries are involved in the project: the Uffizi in Florence, the National Gallery and the Victoria and Albert Museum in London, and the Centre de Recherche et de Restauration des Musees de France (C2RMF) which is the Louvre related restoration centre. The ARTISTE system currently holds over 160,000 images from four separate collections owned by these partners. The galleries have partnered with NCR, leading player in database and Data Warehouse technology; Interactive Labs, the new media design and development facility of Italy's leading art publishing group, Giunti; IT Innovation, a specialist in building innovative IT systems, and the Department of Electronics and Computer Science at the University of Southampton."
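The first ARTISTE aspect noted in this record, translating local metadata to a common standard so that one query can span several collections, can be sketched as a simple crosswalk. This is a minimal illustration under stated assumptions, not the ARTISTE system: the collection names, local field names and sample records are invented, and only the Dublin Core element names (creator, title, date) are real.

```python
# Hypothetical per-collection crosswalks: local field name -> Dublin Core element.
CROSSWALKS = {
    "gallery_a": {"artist_name": "creator", "work_title": "title", "year": "date"},
    "gallery_b": {"painter": "creator", "name": "title", "dated": "date"},
}


def to_dublin_core(collection: str, record: dict) -> dict:
    """Translate a local record to Dublin Core element names.

    Fields with no mapping are dropped, so the translation can lose data.
    """
    mapping = CROSSWALKS[collection]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def search(collections: dict, element: str, term: str) -> list:
    """Case-insensitive substring search across all collections via the common elements."""
    hits = []
    for name, records in collections.items():
        for record in records:
            dc_record = to_dublin_core(name, record)
            if term.lower() in str(dc_record.get(element, "")).lower():
                hits.append((name, dc_record))
    return hits


# Invented sample data; "shelf" has no Dublin Core mapping and is dropped.
collections = {
    "gallery_a": [
        {"artist_name": "A. Painter", "work_title": "Harbour at Dawn",
         "year": "1650", "shelf": "B2"}
    ],
    "gallery_b": [{"painter": "a. painter", "name": "Still Life", "dated": "1652"}],
}
hits = search(collections, "creator", "painter")
```

Each gallery keeps its own field names; only the crosswalk table has to change when a collection is added or its local schema is revised.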
This is one of a series of guides produced by the Cedars digital preservation project. This guide concentrates on the technical approaches that Cedars recommends as a result of its experience. The accent is on preservation, without which continued access is not possible. The time scale is at least decades, i.e. way beyond the lifetime of any hardware technology. The overall preservation strategy is to remove the data from its medium of acquisition and to preserve the digital content as a stream of bytes. There is good reason to be confident that data held as a stream of bytes can be preserved indefinitely. Just as there is no access without preservation, preservation with no prospect of future access is a very sterile exercise. As well as preserving the data as a byte-stream, Cedars adds in metadata. This includes reference to facilities (called technical metadata in this document) for accessing the intellectual content of the preserved data. This technical metadata will usually include actual software for use in accessing the data. It will be stored as a preserved object in the overall archive store, and will be revised as technology evolves making new methods of access to preserved objects appropriate. There will be big economies of scale, as most, if not all, objects of the same type will share the same technical metadata. Cedars recommends against repeated format conversions, and instead argues for keeping the preserved byte-stream, while tracking evolving technology by maintaining the technical metadata. It is for this reason that Cedars includes only a reference to the technical metadata in the preserved data object. Thus future users of the object will be pointed to information appropriate to their own era, rather than that of the object's preservation. The monitoring and updating of this aspect of the technical metadata is a vital function of the digital library. 
In practice, Cedars expects that very many preserved digital objects will be in the same format, and will reference the same technical metadata. Access to a preserved object then involves Migration on Request, in that any necessary migration from an obsolete format to an appropriate current day format happens at the point of request. As well as recommending actions to be taken to preserve digital objects, Cedars also recommends the use of a permanent naming scheme, with a strong recommendation that such a scheme should be infinitely extensible.
Critical Arguements
CA "This document is intended to inform technical practitioners in the actual preservation of digital materials, and also to highlight to library management the importance of this work as continuing their traditional scholarship role into the 21st century."
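The strategy described in this record (keep the original byte-stream untouched, store only a reference to shared technical metadata, and migrate at the point of request) can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than Cedars software: the registry, the reference string "text/latin-1", the identifier and the renderer functions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class PreservedObject:
    """Hypothetical preserved object: bytes plus a reference, never the metadata itself."""

    identifier: str          # permanent name (illustrative value below)
    byte_stream: bytes       # never rewritten after ingest
    tech_metadata_ref: str   # pointer into the shared technical metadata store


# Shared technical metadata, keyed by reference. Here each entry is simply a
# function that renders the preserved bytes in a current-day form.
TECH_METADATA: Dict[str, Callable[[bytes], str]] = {}


def access(obj: PreservedObject) -> str:
    """Migration on request: resolve the reference and convert at access time."""
    renderer = TECH_METADATA[obj.tech_metadata_ref]
    return renderer(obj.byte_stream)


obj = PreservedObject("example:0001", b"caf\xe9", "text/latin-1")

# Technical metadata as it stands at ingest time.
TECH_METADATA["text/latin-1"] = lambda data: data.decode("latin-1")
first = access(obj)

# Later the technical metadata is revised once, for every object that
# references it; the preserved byte-stream itself is left unchanged.
TECH_METADATA["text/latin-1"] = lambda data: "<p>" + data.decode("latin-1") + "</p>"
second = access(obj)
```

Because many objects share one reference, revising a single registry entry updates access for all of them, which is the economy of scale the guide points to.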
This document provides some background on preservation metadata for those interested in digital preservation. It first attempts to explain why preservation metadata is seen as an essential part of most digital preservation strategies. It then gives a broad overview of the functional and information models defined in the Reference Model for an Open Archival Information System (OAIS) and describes the main elements of the Cedars outline preservation metadata specification. The next sections take a brief look at related metadata initiatives, make some recommendations for future work and comment on cost issues. At the end there are some brief recommendations for collecting institutions and the creators of digital content followed by some suggestions for further reading.
Critical Arguements
CA "This document is intended to provide a brief introduction to current preservation metadata developments and introduce the outline metadata specifications produced by the Cedars project. It is aimed in particular at those who may have responsibility for digital preservation in the UK further and higher education community, e.g. senior staff in research libraries and computing services. It should also be useful for those undertaking digital content creation (digitisation) initiatives, although it should be noted that specific guidance on this is available elsewhere. The guide may also be of interest to other kinds of organisations that have an interest in the long-term management of digital resources, e.g. publishers, archivists and records managers, broadcasters, etc. This document aims to provide: A rationale for the creation and maintenance of preservation metadata to support digital preservation strategies, e.g. migration or emulation; An introduction to the concepts and terminology used in the influential ISO Reference Model for an Open Archival Information System (OAIS); Brief information on the Cedars outline preservation metadata specification and the outcomes of some related metadata initiatives; Some notes on the cost implications of preservation metadata and how these might be reduced."
Conclusions
RQ "In June 2000, a group of archivists, computer scientists and metadata experts met in the Netherlands to discuss metadata developments related to recordkeeping and the long-term preservation of archives. One of the key conclusions made at this working meeting was that the recordkeeping metadata communities should attempt to co-operate more with other metadata initiatives. The meeting also suggested research into the contexts of creation and use, e.g. identifying factors that might encourage or discourage creators from meeting recordkeeping metadata requirements. This kind of research would also be useful for wider preservation metadata developments. One outcome of this meeting was the setting up of an Archiving Metadata Forum (AMF) to form the focus of future developments." ... "Future work on preservation metadata will need to focus on several key issues. Firstly, there is an urgent need for more practical experience of undertaking digital preservation strategies. Until now, many preservation metadata initiatives have largely been based on theoretical considerations or high-level models like the OAIS. This is not in itself a bad thing, but it is now time to begin to build metadata into the design of working systems that can test the viability of digital preservation strategies in a variety of contexts. This process has already begun in initiatives like the Victorian Electronic Records Strategy and the San Diego Supercomputer Center's 'self-validating knowledge-based archives'. A second need is for increased co-operation between the many metadata initiatives that have an interest in digital preservation. This may include the comparison and harmonisation of various metadata specifications, where this is possible. The OCLC/RLG working group is an example of how this has been taken forward within a particular domain. There is a need for additional co-operation with recordkeeping metadata specialists, computing scientists and others in the metadata research community. Thirdly, there is a need for more detailed research into how metadata will interact with different formats, preservation strategies and communities of users. This may include some analysis of what metadata could be automatically extracted as part of the ingest process, an investigation of the role of content creators in metadata provision, and the production of user requirements." ... "Also, thought should be given to the development of metadata standards that will permit the easy exchange of preservation metadata (and information packages) between repositories." ... "As well as ensuring that digital repositories are able to facilitate the automatic capture of metadata, some thought should also be given to how best digital repositories could deal with any metadata that might already exist."
SOW
DC "Funded by JISC (the Joint Information Systems Committee of the UK higher education funding councils), as part of its Electronic Libraries (eLib) Programme, Cedars was the only project in the programme to focus on digital preservation." ... "In the digital library domain, the development of a recommendation on preservation metadata is being co-ordinated by a working group supported by OCLC and the RLG. The membership of the working group is international, and includes key individuals who were involved in the development of the Cedars, NEDLIB and NLA metadata specifications."
The creation and use of metadata is likely to become an important part of all digital preservation strategies, whether they are based on hardware and software conservation, emulation or migration. The UK Cedars project aims to promote awareness of the importance of digital preservation, to produce strategic frameworks for digital collection management policies and to promote methods appropriate for long-term preservation, including the creation of appropriate metadata. Preservation metadata is a specialised form of administrative metadata that can be used as a means of storing the technical information that supports the preservation of digital objects. In addition, it can be used to record migration and emulation strategies, to help ensure authenticity, and to note rights management and collection management data. It will also need to interact with resource discovery metadata. The Cedars project is attempting to investigate some of these issues and will provide some demonstrator systems to test them.
Notes
This article was presented at the Joint RLG and NPO Preservation Conference: Guidelines for Digital Imaging, held September 28-30, 1998.
Critical Arguements
CA "Cedars is a project that aims to address strategic, methodological and practical issues relating to digital preservation (Day 1998a). A key outcome of the project will be to improve awareness of digital preservation issues, especially within the UK higher education sector. Attempts will be made to identify and disseminate: Strategies for collection management; Strategies for long-term preservation. These strategies will need to be appropriate to a variety of resources in library collections. The project will also include the development of demonstrators to test the technical and organisational feasibility of the chosen preservation strategies. One strand of this work relates to the identification of preservation metadata and a metadata implementation that can be tested in the demonstrators." ... "The Cedars Access Issues Working Group has produced a preliminary study of preservation metadata and the issues that surround it (Day 1998b). This study describes some digital preservation initiatives and models with relation to the Cedars project and will be used as a basis for the development of a preservation metadata implementation in the project. The remainder of this paper will describe some of the metadata approaches found in these initiatives."
Conclusions
RQ "The Cedars project is interested in helping to develop suitable collection management policies for research libraries." ... "The definition and implementation of preservation metadata systems is going to be an important part of the work of custodial organisations in the digital environment."
SOW
DC "The Cedars (CURL exemplars in digital archives) project is funded by the Joint Information Systems Committee (JISC) of the UK higher education funding councils under Phase III of its Electronic Libraries (eLib) Programme. The project is administered through the Consortium of University Research Libraries (CURL) with lead sites based at the Universities of Cambridge, Leeds and Oxford."
Type
Web Page
Title
Metadata for preservation : CEDARS project document AIW01
This report is a review of metadata formats and initiatives in the specific area of digital preservation. It supplements the DESIRE Review of metadata (Dempsey et al. 1997). It is based on a literature review and information picked up at a number of workshops and meetings, and is an attempt to briefly describe the state of the art in the area of metadata for digital preservation.
Critical Arguements
CA "The projects, initiatives and formats reviewed in this report show that much work remains to be done. . . . The adoption of persistent and unique identifiers is vital, both in the CEDARS project and outside. Many of these initiatives mention "wrappers", "containers" and "frameworks". Some thought should be given to how metadata should be integrated with data content in CEDARS. Authenticity (or intellectual preservation) is going to be important. It will be interesting to investigate whether some archivists' concerns with custody or "distributed custody" will have relevance to CEDARS."
Conclusions
RQ Which standards and initiatives described in this document have proved viable preservation metadata models?
SOW
DC OAIS emerged out of an initiative spearheaded by NASA's Consultative Committee for Space Data Systems. It has been shaped and promoted by the RLG and OCLC. Several international projects have played key roles in shaping the OAIS model and adapting it for use in libraries, archives and research repositories. OAIS-modeled repositories include the CEDARS Project, Harvard's Digital Repository, Koninklijke Bibliotheek (KB), the Library of Congress' Archival Information Package for audiovisual materials, MIT's D-Space, OCLC's Digital Archive and TERM: the Texas Email Repository Model.
Type
Web Page
Title
METS : Metadata Encoding and Transmission Standard
CA "METS, although in its early stages, is already sufficiently established amongst key digital library players that it can reasonably be considered the only viable standard for digital library objects in the foreseeable future. Although METS may be an excellent framework, it is just that and only that. It does not prescribe the content of the metadata itself, and this is a continuing problem for METS and all other schema to contend with if they are to realize their full functionality and usefulness."
Conclusions
RQ The standardization (via some sort of cataloging rules) of the content held by metadata "containers" urgently needs to be addressed. If not, the full value of any metadata scheme, no matter how extensible or robust, will not be realized.
CA The metadata necessary for successful management and use of digital objects is both more extensive than and different from the metadata used for managing collections of printed works and other physical materials. Without structural metadata, the page image or text files comprising the digital work are of little use, and without technical metadata regarding the digitization process, scholars may be unsure of how accurate a reflection of the original the digital version provides. For internal management purposes, a library must have access to appropriate technical metadata in order to periodically refresh and migrate the data, ensuring the durability of valuable resources.
SOW
DC OAIS emerged out of an initiative spearheaded by NASA's Consultative Committee for Space Data Systems. It has been shaped and promoted by the RLG and OCLC. Several international projects have played key roles in shaping the OAIS model and adapting it for use in libraries, archives and research repositories. OAIS-modeled repositories include the CEDARS Project, Harvard's Digital Repository, Koninklijke Bibliotheek (KB), the Library of Congress' Archival Information Package for audiovisual materials, MIT's D-Space, OCLC's Digital Archive and TERM: the Texas Email Repository Model.
This document is a revision and expansion of "Metadata Made Simpler: A guide for libraries," published by NISO Press in 2001.
Publisher
NISO Press
Critical Arguements
CA An overview of what metadata is and does, aimed at librarians and other information professionals. Describes various metadata schemas. Concludes with a bibliography and glossary.
Type
Web Page
Title
Preservation Metadata and the OAIS Information Model: A Metadata Framework to Support the Preservation of Digital Objects
CA "In March 2000, OCLC and RLG sponsored the creation of a working group to explore consensus-building in the area of preservation metadata. ... The charge of the group was to pool their expertise and experience to develop a preservation metadata framework applicable to a broad range of digital preservation activities." (p.1) "The OAIS information model offers a broad categorization of the types of information falling under the scope of preservation metadata; it falls short, however, of providing a decomposition of these information types into a list of metadata elements suitable for practical implementation. It is this need that the working group addressed in the course of its activities, the results of which are reported in this paper." (p. 47)
Conclusions
RQ "The metadata framework described in this paper can serve as a foundation for future work in the area of preservation metadata. Issues of particular importance include strategies and best practices for implementing preservation metadata in an archival system; assessing the degree of descriptive richness required by various types of digital preservation activities; developing algorithms for producing preservation metadata automatically; determining the scope for sharing preservation metadata in a cooperative environment; and moving beyond best practice towards an effort at formal standards building in this area." (p. 47)
SOW
DC "[The OCLC and RLG working group] began its work by publishing a white paper entitled Preservation Metadata for Digital Objects: A Review of the State of the Art, which defined and discussed the concept of preservation metadata, reviewed current thinking and practice in the use of preservation metadata, and identified starting points for consensus-building activity in this area. The group then turned its attention to the main focus of its activity -- the collaborative development of a preservation metadata framework. This paper reports the results of the working group's efforts in that regard." (p. 1-2)
Type
Web Page
Title
Metadata Resources: Metadata Encoding and Transmission Standard (METS)
DC OAIS emerged out of an initiative spearheaded by NASA's Consultative Committee for Space Data Systems. It has been shaped and promoted by the RLG and OCLC. Several international projects have played key roles in shaping the OAIS model and adapting it for use in libraries, archives and research repositories. OAIS-modeled repositories include the CEDARS Project, Harvard's Digital Repository, Koninklijke Bibliotheek (KB), the Library of Congress' Archival Information Package for audiovisual materials, MIT's D-Space, OCLC's Digital Archive and TERM: the Texas Email Repository Model.