Artiste is a European project developing a cross-collection search system for art galleries and museums. It combines image content retrieval with text-based retrieval and uses RDF mappings in order to integrate diverse databases. The test sites of the Louvre, Victoria and Albert Museum, Uffizi Gallery and National Gallery London provide their own database schema for existing metadata, avoiding the need for migration to a common schema. The system will accept a query based on one museum's fields and convert them, through an RDF mapping, into a form suitable for querying the other collections. The nature of some of the image processing algorithms means that the system can be slow for some computations, so the system is session-based to allow the user to return to the results later. The system has been built within a J2EE/EJB framework, using the JBoss Enterprise Application Server.
Secondary Title
WWW2002: The Eleventh International World Wide Web Conference
Publisher
International World Wide Web Conference Committee
ISBN
1-880672-20-0
Critical Arguments
CA "A key aim is to make a unified retrieval system which is targeted to users' real requirements and which is usable with integrated cross-collection searching. Museums and Galleries often have several digital collections ranging from public access images to specialised scientific images used for conservation purposes. Access from one gallery to another was not common in terms of textual data and not done at all in terms of image-based queries. However the value of cross-collection access is recognised as important for example in comparing treatments and conditions of paintings. While ARTISTE is primarily designed for inter-museum searching it could equally be applied to museum intranets. Within a Museum's intranet there may be systems which are not interlinked due to local management issues."
Conclusions
RQ "The query language for this type of system is not yet standardised but we hope that an emerging standard will provide the session-based connectivity this application seems to require due to the possibility of long query times." ... "In the near future, the project will be introducing controlled vocabulary support for some of the metadata fields. This will not only make retrieval more robust but will also facilitate query expansion. The Louvre's multilingual thesaurus will be used in order to ensure greater interoperability. The system is easily extensible to other multimedia types such as audio and video (eg by adding additional query items such as "dialog" and "video sequence" with appropriate analysers). A follow-up project is scheduled to explore this further. There is some scope for relating our RDF query format to the emerging query standards such as XQuery and we also plan to feed our experience into standards such as the ZNG initiative."
SOW
DC "The Artiste project is a European Commission funded collaboration, investigating the use of integrated content and metadata-based image retrieval across disparate databases in several major art galleries across Europe. Collaborating galleries include the Louvre in Paris, the Victoria and Albert Museum in London, the Uffizi Gallery in Florence and the National Gallery in London." ... "Artiste is funded by the European Community's Framework 5 programme. The partners are: NCR, The University of Southampton, IT Innovation, Giunti Multimedia, The Victoria and Albert Museum, The National Gallery, The research laboratory of the museums of France (C2RMF) and the Uffizi Gallery. We would particularly like to thank our collaborators Christian Lahanier, James Stevenson, Marco Cappellini, John Cupitt, Raphaela Rimabosci, Gert Presutti, Warren Stirling, Fabrizio Giorgini and Roberto Vacaro."
This study focuses upon access to authentic electronic records that are no longer required in day-to-day operations and that have been set aside in a recordkeeping system or storage repository for future reference. One school of thought, generally associated with computer information technology specialists, holds that long-term access to electronic records is primarily a technological issue with little attention devoted to authenticity. Another school of thought, associated generally with librarians, archivists, and records managers, contends that long-term access to electronic records is as much an intellectual issue as it is a technological issue. This latter position is clearly evident in several recent research projects and studies about electronic records whose findings illuminate the discussion of long-term access to electronic records. Therefore, a review of eight research projects highlighting findings relevant for long-term access to electronic records begins this chapter. This review is followed by a discussion, from the perspective of archival science, of nine questions that a long-term access strategy must take into account. The nine issues are: What is a document?; What is a record?; What are authentic electronic records?; What does "archiving" mean?; What is an authentic reformatted electronic record?; What is a copy of an authentic electronic record?; What is an authentic converted electronic record?; What is involved in the migration of authentic electronic records?; What is technology obsolescence?
Book Title
Authentic Electronic Records: Strategies for Long-Term Access
Publisher
Cohasset Associates, Inc.
Publication Location
Chicago
ISBN
0970064004
Critical Arguments
CA "Building upon the key concepts and concerns articulated by the studies described above, this report attempts to move the discussion of long-term access to electronic records toward more clearly identified, generally applicable and readily im(TRUNCATED)
Conclusions
RQ
SOW
DC This book chapter was written by Charles M. Dollar for Cohasset Associates, Inc. Mr. Dollar has "twenty-five years of experience in working with electronic records as a manager at the National Archives and Records Administration, as an archival educator at the University of British Columbia, and a consultant to governments and businesses in North America, Asia, Europe, and the Middle East." Cohasset Associates Inc. is "one of the nation's foremost consulting firms specializing in document-based information management."
Type
Journal
Title
Migration Strategies within an Electronic Archive: Practical Experience and Future Research
Pfizer Central Research, Sandwich, England has developed an Electronic Archive to support the maintenance and preservation of electronic records used in the discovery and development of new medicines. The Archive has been developed to meet regulatory, scientific and business requirements. The long-term preservation of electronic records requires that migration strategies be developed both for the Archive and the records held within the Archive. The modular design of the Archive will facilitate the migration of hardware components. Selecting an appropriate migration strategy for electronic records requires careful project management skills allied to appraisal and retention management. Having identified when the migration of records is necessary, it is crucial that alternative technical solutions remain open.
DOI
10.1023/A:1009093604632
Critical Arguments
CA Describes a system of archiving and migration of electronic records (Electronic Archive) at Pfizer Central Research. "Our objective is to provide long-term, safe and secure storage for electronic records. The archive acts as an electronic record center and borrows much from traditional archive theory." (p. 301)
Phrases
<P1> Migration, an essential part of the life-cycle of electronic records, is not an activity that occurs in isolation. It is deeply related to the "Warrant" which justifies our record-keeping systems, and to the metadata which describe the data on our systems. (p. 301-302) <warrant> <P2> Our approach to electronic archiving, and consequently our migration strategy, has been shaped by the business requirements of the Pharmaceutical industry, the technical infrastructure in which we work, the nature of scientific research and development, and by new applications for traditional archival skills. <warrant> (p. 302) <P3> The Pharmaceutical industry is regulated by industry Good Practice Guidelines such as Good Laboratory Practice, Good Clinical Practice and Good Manufacturing Practice. Adherence to these standards is monitored by Government agencies such as the U.S. Food and Drug Administration (FDA) and in Britain the Department of Health (DoH). The guidelines require that data relating to any compound used in man be kept for the lifetime of that compound during its use in man. This we may take to be 40 years or more, during which time the data must remain identifiable and reproducible in case of regulatory inspection. <warrant> (p. 302) <P4> The record-keeping requirements of the scientific research and development process also shape migration strategies. ... Data must be able to be manipulated as well as being identifiable and legible. <warrant> (p. 303) <P5> [W]e have adapted traditional archival theory to our working environment and the new imperatives of electronic archiving. We have utilised retention scheduling to provide a vehicle for metadata file description alongside retention requirements. We have also placed great importance on appraisal as a tool to evaluate records which require to be migrated. (p. 303) <P6> Software application information is therefore collected as part of the metadata description for each file. (p. 
303) <P7> The migration of the database from one version to another or to a new schema represents a significant migration challenge in terms of the project management and validation necessary to demonstrate that a new database accurately represents our original data set. (p. 303-304) <P8> Assessing the risk of migration exercises is only one of several issues we have identified which need to be addressed before any migration of the archive or its components takes place. (p. 304) <P9> [F]ew organisations can cut themselves off totally from their existing record-keeping systems, whether they be paper or electronic. (p. 304) <P10> Critical to this model is identifying the data which are worthy of long-term preservation and transfer to the Archive. This introduces new applications for the retention and appraisal of electronic records. Traditional archival skills can be utilised in deciding which records are worthy of retention. Once they are in the Archive it will become critical to return time and again to those records in a process of "constant review" to ensure that records remain, identifiable, legible and manipulatable. (p. 305) <P11> Having decided when to migrate electronic records, it is important to decide if it is worth it. Our role in Records Management is to inform the business leaders and budget holders when a migration of electronic records will be necessary. It is also our role to provide the business with an informed decision. A key vehicle in this process will be the retention schedule, which is not simply a tool to schedule the destruction of records. It could also be used to schedule software versions. More importantly, with event driven requirements it is a vehicle for constant review and appraisal of record holdings. The Schedule also defines important parts of the metadata description for each file in the Archive. 
The role of appraisal is critical in evaluating record holdings from a migration point of view and will demand greater time and resources from archivists and records managers. (p. 305)
Conclusions
RQ "Any migration of electronic records must be supported by full project management. Migration of electronic records is an increasingly complex area, with the advent of relational databases, multi-dimensional records and the World Wide Web. New solutions must be found, and new research undertaken. ... To develop a methodology for the migration of electronic records demands further exploration of the role of the "warrant" both external and internal to any organisation, which underpins electronic record-keeping practices. It will become critical to find new and practical ways to identify source software applications. ... The role of archival theory, especially appraisal and retention scheduling, in migration strategies demands greater consideration. ... The issues raised by complex documents are perhaps the area which demands the greatest research for the future. In this respect however, the agenda is being set by vendors promoting new technologies with short-term business goals. It may appear that electronic records do not lend themselves to long-term preservation. ... The development, management and operation of an Electronic Archive and migration strategy demands a multitude of skills that can only be achieved by a multi-disciplinary team of user, records management, IT, and computing expertise. Reassuringly, the key factor in migrating electronic archives will remain people." (p. 306)
Type
Journal
Title
Archival Issues in Network Electronic Publications
"Archives are retained information systems that are developed according to professional principles to meet anticipated demands of user clienteles in the context of the changing conditions created by legal environments and electronic or digital technologies. This article addresses issues in electronic publishing, including authentication, mutability, reformatting, preservation, and standards from an archival perspective. To ensure continuing access to electronically published texts, a special emphasis is placed on policy planning in the development and implementation of electronic systems" (p.701).
Critical Arguments
<P1> Archives are established, administered, and evaluated by institutions, organizations, and individuals to ensure the retention, preservation, and utilization of archival holdings (p.701) <P2> The three principal categories of archival materials are official files of institutions and organizations, publications issued by such bodies, and personal papers of individuals. . . . Electronic information technologies have had profound effects on aspects of all these categories (p.702) <P3> The primary archival concern with regard to electronic publishing is that the published material should be transferred to archival custody. When the transfer occurs, the archivist must address the issues of authentication, appraisal, arrangement, description, and preservation or physical protection (p.702) <P4> The most effective way to satisfy archival requirements for handling electronic information is the establishment of procedures and standards to ensure that valuable material is promptly transferred to archival custody in a format which will permit access on equipment that will be readily available in the future (p.702) <P5> Long-term costs and access requirements are the crucial factors in determining how much information should be retained in electronic formats (p.703) <P6> Authentication involves a determination of the validity or integrity of information. Integrity requires the unbroken custody of a body of information by a responsible authority or individual <warrant> (p.703) <P7> From an archival perspective, the value of information is dependent on its content and the custodial responsibility of the agency that maintains it -- e.g., the source determines authenticity. 
The authentication of archival information requires that it be verified as to source, date, and content <warrant> (p.704) <P8> Information that is mutable, modifiable, or changeable loses its validity if the persons adding, altering, or deleting information cannot be identified and the time, place and nature of the changes is unknown (p.704) <P9> [P]reservation is more a matter of access to information than it is a question of survival of any physical information storage media (p.704) <P10> [T]o approach the preservation of electronic texts by focusing on physical threats will miss the far more pressing matter of ensuring continued accessibility to the information on such storage media (p.706) <P11> If the information is to remain accessible as long as paper, preservation must be a front-end, rather than an ex post facto, action (p.708) <P12> [T]he preservation of electronic texts is first and foremost a matter of editorial and administrative policy rather than of techniques and materials (p.708) <P13> Ultimately, the preservation of electronic publications cannot be solely an archival issue but an administrative one that can be addressed only if the creators and publishers take an active role in providing resources necessary to ensure that ongoing accessibility is part of initial system and product design (p.709) <P14> An encouraging development is that SGML has been considered to be a critical element for electronic publishing because of its transportability and because it supports multiple representations of a single text . . . (p.711) <P15> Underlying all questions of access is the fundamental consideration of cost (p.711)
Type
Journal
Title
Warrant and the Definition of Electronic Records: Questions Arising from the Pittsburgh Project
The University of Pittsburgh Electronic Recordkeeping Research Project established a model for developing functional requirements and metadata specifications based on warrant, defined as the laws, regulations, best practices, and customs that regulate recordkeeping. Research has shown that warrant can also increase the acceptance by records creators and others of functional requirements for recordkeeping. This article identifies areas related to warrant that require future study. The authors conclude by suggesting that requirements for recordkeeping may vary from country to country and industry to industry because of differing warrant.
Publisher
Kluwer Academic Publishers
Publication Location
The Netherlands
Critical Arguments
CA Poses a long series of questions and issues concerning warrant and its ability to increase the acceptance of recordkeeping requirements. Proposes that research be done to answer these questions. Discusses two different views about whether warrant can be universal and/or international.
Phrases
<P1> As we proceeded with the project [the University of Pittsburgh Electronic Recordkeeping Research Project] we ultimately turned our attention to the idea of the literary warrant -- defined as the mandate from law, professional best practices, and other social sources requiring the creation and continued maintenance of records. Wendy Duff's doctoral research found that warrant can increase the acceptance of some recordkeeping functional requirements, and therefore it has the potential to build bridges between archival professionals and others concerned with or responsible for recordkeeping. We did not anticipate the value of the literary warrant and, in the hindsight now available to us, the concept of the warrant may turn out to be the most important outcome of the project. <P2> In Wendy Duff's dissertation, legal, auditing and information science experts evaluated the authority of the sources of warrant for recordkeeping. This part of the study provided evidence that information technology standards may lack authority, but this finding requires further study. Moreover, the number of individuals who evaluated the sources of warrant was extremely small. A much larger number of standards should be included in a subsequent study and a greater number of subjects are needed to evaluate these standards. <P3> We found a strong relationship between warrant and the functional requirements for electronic recordkeeping systems. Research that studies this relationship and determines the different facets that may affect it might provide more insights into the relationship between the warrant and the functional requirements. <P4> [W]e need to develop a better understanding of the degree to which the warrant for recordkeeping operates in various industries, disciplines, and other venues. 
Some institutions operate in a much more regulated environment than others, suggesting that the importance of records and the understanding of records may vary considerably between institutional types, across disciplines and from country to country. <P5> We need to consider whether the recordkeeping functional requirements for evidence hold up or need to be revised for recordkeeping requirements for corporate memory, accountability, and cultural value -- the three broad realms now being used to discuss records and recordkeeping. <P6> The warrant gathered to date has primarily focused on technical, legal or the administrative value of records. A study that tested the effectiveness of warrant that supported the cultural or historical mandate of archives might help archivists gain support for their archival programs. <P7> This concern leads us to a need for more research about the understanding of records and recordkeeping in particular institutions, disciplines, and societies. <P8> A broader, and perhaps equally important question, is whether individual professionals and workers are even aware of their regulatory environment. <P9> How do the notion of the warrant and the recordkeeping functional requirements relate to the ways in which organizations work and the management tools they use, such as business process reengineering and data warehousing? <P10> What are the economic implications for organizations to comply with the functional requirements for recordkeeping in evidence? <P11> Is there a warrant and separate recordkeeping functional requirements for individual or personal recordkeeping? <P12> As more individuals, especially writers, financial leaders, and corporate and societal innovators, adopt electronic information technologies for the creation of their records, an understanding of the degree of warrant for such activity and our ability to use this warrant to manage these recordkeeping systems must be developed. 
<P13> We believe that archivists and records managers can improve their image if they become experts in all aspects of recordkeeping. This will require a thorough knowledge of the legal, auditing, information technology, and management warrant for recordkeeping. <P14> The medical profession emphasizes that [sic] need to practice evidence-based medicine. We need to find out what would happen if records managers followed suit, and emphasized and practiced warrant-based recordkeeping. Would this require a major change in what we do, or would it simply be a new way to describe what we have always done? <P15> More work also has to be done on the implications of warrant and the functional requirements for the development of viable archives and records management programs. <P16> The warrant concept, along with the recordkeeping functional requirements, seem to possess immense pedagogical implications for what future archivists or practicing archivists, seeking to update their skills, should or would be taught. <P17> We need to determine the effectiveness of using the warrant and recordkeeping functional requirements as a basis for graduate archival and records management education and for developing needed topics for research by masters and doctoral students. <P18> The next generation of educational programs might be those located in other professional schools, focusing on the particular requirements for records in such institutions as corporations, hospitals, and the courts. <P19> We also need to determine the effectiveness of using the warrant and recordkeeping functional requirements in continuing education, public outreach, and advocacy for helping policy makers, resource allocators, administrators, and others to understand the importance of archives and records. 
Can the warrant and recordkeeping functional requirements support or foster stronger partnerships with other professions, citizen action groups, and other bodies interested in accountability in public organizations and government? <P20> Focusing on the mandate to keep and manage records, instead of the records as artifacts or interesting stuff, seems much more relevant in late twentieth century society. <P21> We need to investigate the degree to which records managers and archivists can develop a universal method for recordkeeping. ... Our laws, regulations, and best practices are usually different from country to country. Therefore, must any initiative to develop warrant also be bounded by our borders? <P22> A fundamental difference between the Pittsburgh Project and the UBC project is that UBC wishes to develop a method for managing and preserving electronic records that is applicable across all juridical systems and cultures, while the Pittsburgh Project is proposing a model that enables recordkeeping to be both universal and local at the same time. <P23> We now have a records management standard from Australia which is relevant for most North American records programs. It has been proposed as an international standard, although it is facing opposition from some European countries. Can there be an international standard for recordkeeping and can we develop one set of procedures which will be accepted across nations? Or must methods of recordkeeping be adapted to suit specific cultures, juridical systems, or industries?
Conclusions
RQ See above.
Type
Journal
Title
How Do Archivists Make Electronic Archives Usable and Accessible?
CA In order to make electronic archives usable, archivists will need to enhance and link access systems to facilitate resource discovery while making the whole process as seamless and low-cost (or no-cost) as possible for the user.
Phrases
<P1> Rather than assuming that the archival community will succeed in transferring all valuable electronic records to archival institutions for preservation and future access, archivists must develop strategies and methods for accessibility and usability that can span a variety of custodial arrangements. (p.9) <P2> Maintaining linkages between different formats of materials will become increasingly burdensome if archivists do not find ways to develop integrated access systems. (p.10) <P3> Archivists must also think about ways to teach users the principles of a new digital diplomatics so that they can apply these principles themselves to make educated judgements about the accuracy, reliability, and authenticity of the documents they retrieve from electronic archives. (p.15)
Type
Journal
Title
Research Issues in Australian Approaches to Policy Development
Drawing on his experience at the Australian Archives in policy development on electronic records and recordkeeping for the Australian Federal Government sector the author argues for greater emphasis on the implementation side of electronic records management. The author questions whether more research is a priority over implementation. The author also argues that if archival institutions wish to be taken seriously by their clients they need to pay greater attention to getting their own organisations in order. He suggests the way to do this is by improving internal recordkeeping practices and systems and developing a resource and skills base suitable for the delivery of electronic recordkeeping policies and services to clients.
Publisher
Kluwer Academic Publishers
Publication Location
Netherlands
Critical Arguments
CA "None of the issues which have been raised regarding the management of electronic records are insurmountable or even difficult from a technological viewpoint. The technology is there to develop electronic recordkeeping systems. The technology is there to capture and maintain electronic records. The technology is there to enable access over time. The technology is there to enable recordkeeping at a level of sophistication and accuracy hitherto undreamt of. To achieve our goal though requires more than technology, remember that is part of the problem. To achieve our goal requires human understanding, planning, input and motivation and that requires us to convince others that it is worth doing. This view has a significant impact on the development of research agendas and implementation projects." (p. 252) "Looking at electronic records from a strategic recordkeeping perspective requires us to see beyond the specific technology issues toward the wider corporate issues, within our organizational, professional and environmental sectors. In summary they are: Building alliances: nationally and internationally; Re-inventing the archival function: cultural change in the archives and records community and institutions; Getting our own house in order: establishing archival institutions as models of best practice for recordkeeping; Devoting resources to strategic developments; and Re-training and re-skilling archivists and records managers." (p. 252-253)
Phrases
<P1> The issue for me therefore is the development of a strategic approach to recordkeeping, whether it be in Society generally, whole of Government, or in your own corporate environment. The wider focus should be on the development of recordkeeping systems, and specifically electronic recordkeeping systems. Without such a strategic approach I believe our efforts on electronic records will largely be doomed to failure. (p. 252) <P2> We have to influence recordkeeping practices in order to influence the creation and management of electronic records. (p. 253) <P3> Given that there is no universal agreement within the archives and records community to dealing with electronic records how can we expect to successfully influence other sectoral interests and stake-holders, not to mention policy makers and resource providers? Institutions and Professional bodies have to work together and reach agreement and develop strategic positions. (p. 253) <P4> The emerging role of recordkeeping professionals is to define recordkeeping regimes for organizations and their employees, acting as consultants and establishing and monitoring standards, rather than deciding about specific records in specific recordkeeping systems or creating extensive documentation about them. (p. 254) <P5> Archival institutions need to practice what they preach and develop as models for best practice in recordkeeping. (p. 254-255) <P6> Resources devoted to electronic records and recordkeeping policy and implementation within archival institutions has not been commensurate with the task. (p. 255) <P7> Contact with agencies needs to be more focused at middle and senior management to ensure that the importance of accountability and recordkeeping is appreciated and that strategies and systems are put in place to ensure that records are created, kept and remain accessible. (p. 
255) <P8> In order to do this for electronic records archival institutions need to work with agencies to: assist in the development of recordkeeping systems through the provision of appropriate advice; identify electronic records in their custody which are of enduring value; identify and dispose of electronic records in their custody which are not of enduring value; assist agencies in the identification of information or metadata which needs to be captured and maintained; provide advice on access to archival electronic records. (p. 255-256) <P9> The elements of the records continuum need to be reflected as components in the business strategy for archival institutions in the provision of services to its clients. (p. 256)
Conclusions
RQ "In summary I see the unresolved issues and potential research tasks as follows: International Agreement (UN, ICA); National Agreement (Government, Corporate, Sectoral, Professional); Cultural Change in the Archives and Records Community; Re-inventing / re-engineering Archives institutions; Re-training or recruiting; Best practice sites -- the National Archives as a model for best practice recordkeeping; Test sites for creation, capture, migration and networking of records; Functional analysis and appraisal of electronic information systems (electronic recordkeeping systems); Costing the retention of electronic records and records in electronic form." (p. 257)
Type
Journal
Title
Structuring the Records Continuum Part Two: Structuration Theory and Recordkeeping
In the previous issue of Archives and Manuscripts I presented the first part of this two part exploration. It dealt with some possible meanings for 'post' in the term postcustodial. For archivists, considerations of custody are becoming more complex because of changing social, technical and legal considerations. These changes include those occurring in relation to access and the need to document electronic business communications reliably. Our actions, as archivists, in turn become more complex as we attempt to establish continuity of custody in electronic recordkeeping environments. In this part, I continue the case for emphasising the processes of archiving in both our theory and practice. The archives as a functional structure has dominated twentieth century archival discourse and institutional ordering, but we are going through a period of transformation. The structuration theory of Anthony Giddens is used to show that there are very different ways of theorising about our professional activities than have so far been attempted within the archival profession. Giddens' theory, at the very least, provides a useful device for gaining insights into the nature of theory and its relationship with practice. The most effective use of theory is as a way of seeing issues. When seen through the prism of structuration theory, the forming processes of the virtual archives are made apparent.
Critical Arguements
CA "This part of my exploration of the continuum will continue the case for understanding 'postcustodial' as a bookmark term for a major transition in archival practice. That transition involves leaving a long tradition in which continuity was a matter of sequential control. Electronic recordkeeping processes need to incorporate continuity into the essence of recordkeeping systems and into the lifespan of documents within those systems. In addressing this issue I will present a structurationist reading of the model set out in Part 1, using the sophisticated theory contained in the work of Anthony Giddens. Structuration theory deals with process, and illustrates why we must constantly re-assess and adjust the patterns for ordering our activities. It gives some leads on how to go about re-institutionalising these new patterns. When used in conjunction with continuum thinking, Giddens' meta-theory and its many pieces can help us to understand the complexities of the virtual archives, and to work our way towards the establishment of suitable routines for the control of document management, records capture, corporate memory, and collective memory."
Phrases
<P1> Broadly the debate has started to form itself as one between those who represent the structures and functions of an archival institution in an idealised form, and those who increasingly concentrate on the actions and processes which give rise to the record and its carriage through time and space. In one case the record needs to be stored, recalled and disseminated within our institutional frameworks; in the other case it is the processes for storing, recalling, and disseminating the record which need to be placed into a suitable framework. <P2> Structure, for Giddens, is not something separate from human action. It exists as memory, including the memory contained within the way we represent, recall, and disseminate resources including recorded information. <P3> Currently in electronic systems there is an absence of recordkeeping structures and disconnected dimensions. The action part of the duality has raced ahead of the structural one; the structuration process has only just begun. <P4> The continuum model's breadth and richness as a conceptual tool is expanded when it is seen that it can encompass action-structure issues in at least three specialisations within recordkeeping: contemporary recordkeeping - current recordkeeping actions and the structures in which they take place; regulatory recordkeeping - the processes of regulation and the enabling and controlling structures for action such as policies, standards, codes, legislation, and promulgation of best practices; historical recordkeeping - explorations of provenance in which action and structure are examined forensically as part of the data sought about records for their storage, recall and dissemination. <P5> The capacity to imbibe information about recordkeeping practices in agencies will be crucial to the effectiveness of the way archival 'organisations' set up their postcustodial programs. 
They will have to monitor the distribution and exercise of custodial responsibilities for electronic records from before the time of their creation. <warrant> <P6> As John McDonald has pointed out, recordkeeping activities need to occur at desktop level within systems that are not dependent upon the person at the desktop understanding all of the details of the operation of that system. <P7> Giddens' more recent work on reflexivity has many parallels with metadata approaches to recordkeeping. What if the records, as David Bearman predicts, can be self-managing? Will they be able to monitor themselves? <P8> He rejects the life cycle model in sociology, based on ritualised passages through life, and writes of 'open experience thresholds'. Once societies, for example, had rites for coming of age. Coming of age in a high modern society is now a complex process involving a host of experiences and risks which are very different to that of any previous generation. Open experience thresholds replace the life cycle thresholds, and as the term infers, are much less controlled or predictable. <P9> There is a clear parallel with recordkeeping in a high modern environment. The custodial thresholds can no longer be understood in terms of the spatial limits between a creating agency and an archives. The externalities of the archives as place will decline in significance as a means of directly asserting the authenticity and reliability of records. The complexities of modern recordkeeping involve many more contextual relationships and an ever increasing network of relationships between records and the actions that take place in relation to them. We have no need for a life cycle concept based on the premise of generational repetition of stages through which a record can be expected to pass. We have entered an age of more recordkeeping choices and of open experience thresholds. 
<P10> It is the increase in transactionality, and the technologies being used for those transactions, which are different. The solution, easier to write about than implement, is for records to parallel Giddens' high modern individual and make reflexive use of the broader social environment in which they exist. They can reflexively monitor their own action and, with encoding help from archivists and records managers, resolve their own crises as they arise. <warrant> <P11> David Bearman's argument that records can be self-managing goes well beyond the easy stage. It is supported by the Pittsburgh project's preliminary set of metadata specifications. The seeds of self-management can be found in object oriented programming, java, applets, and the growing understanding of the importance and nature of metadata. <P12> Continuum models further assist us to conceive of how records, as metadata encapsulated objects, can resolve many of their own life crises as they thread their way through time and across space. <P13> To be effective monitors of action, archival institutions will need to be recognised by others as the institutions most capable of providing guidance and control in relation to the integration of the archiving processes involved in document management, records capture, the organisation of corporate memory and the networking of archival systems. <warrant> <P14> Signification, in the theoretical domain, refers to our interpretative schemes and the way we encode and communicate our activities. At a macro level this includes language itself; at a micro level it can include our schemes for classification and ordering. <P15> The Pittsburgh project addressed the three major strands of Giddens' theoretical domain. It explored and set out functional requirements for evidence - signification. It sought literary warrants for archival tasks - legitimation. It reviewed the acceptability of the requirements for evidence within organisational cultures - domination. 
<P16> In Giddens' dimensional approach, the theoretical domain is re-defined to be about coding, organising our resources, and developing norms and standards. In this area the thinking has already begun to produce results, which leads this article in to a discussion of structural properties. <P17> Archivists deal with structural properties when, for example, they analyse the characteristics of recorded information such as the document, the record, the archive and the archives. The archives as a fortress is an observable structural property, as is the archives as a physical accumulation of records. Within Giddens' structuration theory, when archivists write about their favourite features, be they records or the archives as a place, they are discussing structural properties. <P18> Postcustodial practice in Australia is already beginning to put together a substantial array of structural properties. These developments are canvassed in the article by O'Shea and Roberts in the previous issue of Archives and Manuscripts. They include policies and strategies, standards, recordkeeping regimes, and what has come to be termed distributed custody. <P19> As [Terry] Eastwood comments in the same article, we do not have adequate electronic recordkeeping systems. Without them there can be no record in time-space to serve any form of accountability. <warrant> <P20> In the Pittsburgh project, for example, the transformation of recordkeeping processes is directed towards the creation and management of evidence, and possible elements of a valid rule-resource set have emerged. Elements can include the control of recordkeeping actions, accountability, the management of risk, the development of recordkeeping regimes, the establishment of recordkeeping requirements, and the specification of metadata. <P21> In a postcustodial approach it is the role of archival institutions to foster better recordkeeping practices within all the dimensions of recordkeeping. <warrant>
Conclusions
RQ "Best practice in the defence of the authoritative qualities of records can no longer be viewed as a linear chain, and the challenge is to establish new ways of legitimating responsibilities for records storage and custody which recognise the shifts which have occurred." ... "The recordkeeping profession should seek to establish itself as ground cover, working across terrains rather than existing tree-like in one spot. Beneath the ground cover there are shafts of specialisation running both laterally and vertically. Perhaps we can, as archivists, rediscover something that a sociologist like Giddens has never forgotten. Societies, including their composite parts, are the ultimate containers of recorded information. As a place in society, as Terry Cook argues, the archives is a multiple reality. We can set in train policies and strategies that can help generate multiplicity without losing respect for particular mine shafts. Archivists have an opportunity to pursue policies which encourage the responsible exercising of a custodial role throughout society, including the professions involved in current, regulatory and historical recordkeeping. If we take up that opportunity, our many goals can be better met and our concerns will be addressed more effectively."
SOW
DC "Frank Upward is a senior lecturer in the Department of Librarianship, Archives and Records at Monash University. He is an historian of the ideas contained in the Australian records continuum approach, and an ex-practitioner within that approach." ... "These two articles, and an earlier one on Ian Maclean and the origins of Australian continuum thinking, have not, so far, contained appropriate acknowledgements. David Bearman provided the necessary detonation of certain archival practices, and much more. Richard Brown and Terry Cook drew my attention to Anthony Giddens' work and their own work has helped shape my views. I have many colleagues at Monash who encourage my eccentricities. Sue McKemmish has helped shape my ideas and my final drafts and Barbara Reed has commented wisely on my outrageous earlier drafts. Livia Iacovino has made me stop and think more about the juridical tradition in recordkeeping. Chris Hurley produced many perspectives on the continuum during the 1996 seminars which have helped me see the model more fully. Don Schauder raised a number of key questions about Giddens as a theorist. Bruce Wearne of the Sociology Department at Monash helped me lift the clarity of my sociological explanations and made me realise how obsessed Giddens is with gerunds. The structural-functionalism of Luciana Duranti and Terry Eastwood provided me with a counterpoint to many of my arguments, but I also owe them debts for their respective explorations of recordkeeping processes and the intellectual milieu of archival ideas, and for their work on the administrative-juridical tradition of recordkeeping. Glenda Acland has provided perceptive comments on my articles - and supportive ones, for which I am most grateful given how different the articles are from conventional archival theorising. Australian Archives, and its many past and present staff members, has been important to me."
Type
Journal
Title
Structuring the Records Continuum Part One: Post-custodial principles and properties
The records continuum is becoming a much used term, but has seldom been defined in ways which show it is a time/space model not a life of the records model. Dictionary definitions of a continuum describe such features as its continuity, the indiscernibility of its parts, and the way its elements pass into each other. Precise definitions, accordingly, have to discern the indiscernible, identify points that are not distinct, and do so in ways which accommodate the continuity of change. This article, and a second part to be published in the next volume, will explore the continuum in time/space terms supported by a theoretical mix of archival science, postmodernity and the 'structuration theory' of Anthony Giddens. In this part the main objectives are to give greater conceptual firmness to the continuum; to clear the way for broader considerations of the nature of the continuum by freeing archivists from the need to debate custody; to show how the structural principles for archival practice are capable of different expression without losing contact with something deeper that can outlive the manner of expression.
Critical Arguements
CA "This is the first instalment of a two part article exploring the records continuum. Together the articles will build into a theory about the constitution of the virtual archives. In this part I will examine what it can mean to be 'postcustodial', outline some possible structural principles for the virtual archives, and present a logical model for the records continuum." ... "In what follows in the remainder of this article (and all of the next) , I will explore the relevance of [Anthony] Giddens' theory to the structuring of the records continuum."
Phrases
<P1> If the archival profession is to avoid a fracture along the lines of paper and electronic media, it has to be able to develop ways of expressing its ideas in models of relevance to all ages of recordkeeping, but do so in ways which are contemporaneous with our own society. <warrant> <P2> We need more of the type of construct provided by the Pittsburgh Project's functional requirements for evidence which are 'high modern' but can apply to recordkeeping over time. <P3> What is essential is for electronic records to be identified, controlled and accessible for as long as they have value to Government and the Community. <warrant> <P4> We have to face up to the complexification of ownership, possession, guardianship and control within our legal system. Even possession can be broken down into physical possession and constructed possession. We also have to face the potential within our technology for ownership, possession, custody or control to be exercised jointly by the archives, the organisation creating the records, and auditing agencies. The complexity requires a new look at our way of allocating authorities and responsibilities. <P5> In what has come to be known as the continuum approach Maclean argued that archivists should base their profession upon studies of the characteristics of recorded information, recordkeeping systems, and classification (the way the records were ordered within recordkeeping systems and the way these were ordered through time). <P6> A significant role for today's archival institution is to help to identify and establish functional requirements for recordkeeping that enable a more systematic approach to authentication than that provided by physical custody. <warrant> <P7> In an electronic work environment it means, in part, that the objectivity, understandability, availability, and usability of records need to be inherent in the way that the record is captured. 
In turn the documents need to be captured in the context of the actions of which they are part, and are recursively involved. <warrant> <P8>A dimensional analysis can be constructed from the model and explained in a number of ways including a recordkeeping system reading. When the co-ordinates of the continuum model are connected, the different dimensions of a recordkeeping system are revealed. The dimensions are not boundaries, the co-ordinates are not invariably present, and things may happen simultaneously across dimensions, but no matter how a recordkeeping system is set up it can be analysed in terms such as: first dimensional analysis: a pre- communication system for document creation within electronic systems [creating the trace]; second dimensional analysis: a post- communication system, for example traditional registry functionality which includes registration, the value adding of data for linking documents and disseminating them, and the maintenance of the record including disposition data [capturing trace as record]; third dimensional analysis: a system involving building, recalling and disseminating corporate memory [organising the record as memory]; fourth dimensional analysis: a system for building, recalling and disseminating collective memory (social, cultural or historical) including information of the type required for an archival information system [pluralizing the memory]. <P9> In the high modern recordkeeping environment of the 1990's a continuum has to take into account a different array of recordkeeping tools. These tools, plucking a few out at random but ordering the list dimensionally, include: document management software, Australian records system software, the intranet and the internet. <P10> In terms of a records continuum which supports an evidence based recordkeeping approach, the second dimension is crucial. This is where the document is disembedded from the immediate contexts of the first dimension. 
It is this disembedding process that gives the record its value as a 'symbolic token'. A document is embedded in an act, but the document as a record needs to be validatable using external reference points. These points include the operation of the recordkeeping system into which it was received, and information pertaining to the technical, social (including business) and communication processes of which the document was part.
Conclusions
RQ "Postcustodial approaches to archives and records cannot be understood if they are treated as a dualism. They are not the opposite of custody. They are a response to opportunities for asserting the role of an archives - and not just its authentication role - in many re-invigorating ways, a theme which I will explore further in the next edition of Archives and Manuscripts."
SOW
DC "Frank Upward is a senior lecturer in the Department of Librarianship, Archives and Records at Monash University. He is an historian of the ideas contained in the Australian records continuum approach, and an ex-practitioner within that approach."
Type
Journal
Title
Managing the Present: Metadata as Archival Description
Traditional archival description undertaken at the terminal stages of the life cycle has had two deleterious effects on the archival profession. First, it has resulted in enormous, and in some cases, insurmountable processing backlogs. Second, it has limited our ability to capture crucial contextual and structural information throughout the life cycle of record-keeping systems that are essential for fully understanding the fonds in our institutions. This shortcoming has resulted in an inadequate knowledge base for appraisal and access provision. Such complications will only become more magnified as distributed computing and complex software applications continue to expand throughout organizations. A metadata strategy for archival description will help mitigate these problems and enhance the organizational profile of archivists who will come to be seen as valuable organizational knowledge and accountability managers.
Critical Arguements
CA "This essay affirms this call for evaluation and asserts that the archival profession must embrace a metadata systems approach to archival description and management." ... "It is held here that the requirements for records capture and description are the requirements for metadata."
Phrases
<P1> New archival organizational structures must be created to ensure that records can be maintained in a usable form. <warrant> <P2> The recent report of Society of American Archivists (SAA) Committee on Automated Records and Techniques (CART) on curriculum development has argued that archivists need to "understand the nature and utility of metadata and how to interpret and use metadata for archival purposes." <warrant> <P3> The report advises archivists to acquire knowledge on the meanings of metadata, its structures, standards, and uses for the management of electronic records. Interestingly, the requirements for archival description immediately follow this section and note that archivists need to isolate the descriptive requirements, standards, documentation, and practices needed for managing electronic records. <warrant> <P4> Clearly, archivists need to identify what types of metadata will best suit their descriptive needs, underscoring the need for the profession to develop strategies and tactics to satisfy these requirements within active software environments. <warrant> <P5> Underlying the metadata systems strategy for describing and managing electronic information technologies is the seemingly universal agreement amongst electronic records archivists on the requirement to intervene earlier in the life cycle of electronic information systems. <warrant> <P6> Metadata has loomed over the archival management of electronic records for over five years now and is increasingly being promised as a basic control strategy for managing these records. <warrant> <P7> However, she [Margaret Hedstrom] also warns that as descriptive practices shift from creating descriptive information to capturing description along with the records, archivists may discover that managing the metadata is a much greater challenge than managing the records themselves. 
<P8> Archivists must seek to influence the creation of record-keeping systems within organizations by connecting the transaction that created the data to the data itself. Such a connection will link informational content, structure, and the context of transactions. Only when these conditions are met will we have records and an appropriate infrastructure for archival description. <warrant> <P9> Charles Dollar has argued that archivists increasingly will have to rely upon and shape the metadata associated with electronic records in order to fully capture provenance information about them. <warrant> <P10> Bearman proposes a metadata systems strategy, which would focus more explicitly on the context out of which records arise, as opposed to concentrating on their content. This axiom is premised on the assumption that "lifecycle records systems control should drive provenance-based description and link to top-down definitions of holdings." <warrant> <P11> Bearman and Margaret Hedstrom have built upon this model and contend that properly specified metadata capture could fully describe systems while they are still active and eliminate the need for post-hoc description. The fundamental change wrought in this approach is the shift from doing things to records (surveying, scheduling, appraising, disposing/accessioning, describing, preserving, and accessing) to providing policy direction for adequate documentation through management of organizational behavior (analyzing organizational functions, defining business transactions, defining record metadata, identifying control tactics, and establishing the record-keeping regime). Within this model archivists focus on steering how records will be captured (and that they will be captured) and how they will be managed and described within record-keeping systems while they are still actively serving their parent organization. 
<P12> Through the provision of policy guidance and oversight, organizational record-keeping is managed in order to ensure that the "documentation of organizational missions, functions, and responsibilities ... and reporting relationships within the organization, will be undertaken by the organizations themselves in their administrative control systems." <warrant> <P13> Through a metadata systems approach, archivists can realign themselves strategically as managers of authoritative information about organizational record-keeping systems, providing for the capture of information about each system, its contextual attributes, its users, its hardware configurations, its software configurations, and its data configurations. <warrant> <P14> The University of Pittsburgh's functional requirements for record-keeping provides a framework for such information management structure. These functional requirements are appropriately viewed as an absolute ideal, requiring testing within live systems and organizations. If properly implemented, however, they can provide a concrete model for metadata capture that can automatically supply many of the types of descriptive information both desired by archivists and required for elucidating the context out of which records arise. <P15> It is possible that satisfying these requirements will contribute to the development of a robust archival description process integrating "preservation of meaning, exercise of control, and provision of access" within "one principal, multipurpose descriptive instrument" hinted at by Luciana Duranti as a possible outcome of the electronic era. <P16> However, since electronic records are logical and not physical entities, there is no physical effort required to access and process them, just mental modelling. 
<P17> Depending on the type of metadata that is built into and linked to electronic information systems, it is possible that users can identify individual records at the lowest level of granularity and still see the top-level process it is related to. Furthermore, records can be reaggregated based upon user-defined criteria though metadata links that track every instance of their use, their relations to other records, and the actions that led to their creation. <P18> A metadata strategy for archival description will help to mitigate these problems and enhance the organizational profile of archivists, who will come to be seen as valuable organizational knowledge and accountability managers. <warrant>
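The capture model running through these phrases (binding contextual metadata to the record at creation, and documenting every subsequent use so the record becomes largely self-managing) can be rendered as a minimal sketch. This is an illustrative reading of the idea, not code from the Pittsburgh project or any functioning recordkeeping system; the class, fields, and sample values are all invented for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass
class Record:
    """A record whose contextual metadata is bound at capture, not reconstructed later."""
    content: bytes
    transaction: str   # the business transaction that created the data
    agent: str         # who performed it
    function: str      # the organizational function it documents
    captured_at: str = field(default_factory=_now)
    use_history: list = field(default_factory=list)

    def access(self, user: str, purpose: str) -> bytes:
        # Every use is itself documented, so description accumulates
        # while the record is still active in its parent organization.
        self.use_history.append({"user": user, "purpose": purpose, "at": _now()})
        return self.content


rec = Record(content=b"Approved budget variation #7",
             transaction="budget-approval",
             agent="finance.director",
             function="financial-management")
rec.access(user="auditor", purpose="compliance review")
print(len(rec.use_history))  # 1
```

The point of the sketch is the shift the phrases describe: description is not applied to the record after the fact but captured with it, and each transaction against the record extends its own documentation.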
Conclusions
RQ "First and foremost, the promise of metadata for archival description is contingent upon the creation of electronic record-keeping systems as opposed to a continuation of the data management orientation that seems to dominate most computer applications within organizations." ... "As with so many other aspects of the archival endeavour, these requirements and the larger metadata model for description that they are premised upon necessitate further exploration through basic research."
SOW
DC "In addition to New York State, recognition of the failure of existing software applications to capture a full compliment of metadata required for record-keeping and the need for such records management control has also been acknowledged in Canada, the Netherlands, and the World Bank." ... "In conjunction with experts in electronic records managment, an ongoing research project at the University of Pittsburgh has developed a set of thirteen functional requirements for record-keeping. These requirements provide a concrete metadata tool sought by archivists for managing and describing electronic records and electronic record-keeping systems." ... David A. Wallace is an Assistant Professor at the School of Information, University of Michigan, where he teaches in the areas of archives and records management. He holds a B.A. from Binghamton University, a Masters of Library Science from the University at Albany, and a doctorate from the University of Pittsburgh. Between 1988 and 1992, he served as Records/Systems/Database Manager at the National Security Archive in Washington, D.C., a non-profit research library of declassified U.S. government records. While at the NSA he also served as Technical Editor to their "The Making of U.S. Foreign Policy" series. From 1993-1994, he served as a research assistant to the University of Pittsburgh's project on Functional Requirements for Evidence in Recordkeeping, and as a Contributing Editor to Archives and Museum Informatics: Cultural Heritage Informatics Quarterly. From 1994 to 1996, he served as a staff member to the U.S. Advisory Council on the National Information Infrastructure. In 1997, he completed a dissertation analyzing the White House email "PROFS" case. 
Since arriving at the School of Information in late 1997, he has served as Co-PI on an NHPRC funded grant assessing strategies for preserving electronic records of collaborative processes, as PI on an NSF Digital Government Program funded planning grant investigating the incorporation of born digital records into a FOIA processing system, co-edited Archives and the Public Good: Accountability and Records in Modern Society (Quorum, 2002), and was awarded ARMA International's Britt Literary Award for an article on email policy. He also serves as a consultant to the South African History Archives Freedom of Information Program and is exploring the development of a massive digital library of declassified imaged/digitized U.S. government documents charting U.S. foreign policy.
Type
Electronic Journal
Title
ARTISTE: An integrated Art Analysis and Navigation Environment
This article focuses on the description of the objectives of the ARTISTE project (for "An integrated Art Analysis and Navigation environment") that aims at building a tool for the intelligent retrieval and indexing of high resolution images. The ARTISTE project will address professional users in the fine arts as the primary end-user base. These users provide services for the ultimate end-user, the citizen.
Critical Arguements
CA "European museums and galleries are rich in cultural treasures but public access has not reached its full potential. Digital multimedia can address these issues and expand the accessible collections. However, there is a lack of systems and techniques to support both professional and citizen access to these collections."
Phrases
<P1> New technology is now being developed that will transform that situation. A European consortium, partly funded by the EU under the fifth R&D framework, is working to produce a new management system for visual information. <P2> Four major European galleries (The Uffizi in Florence, The National Gallery and the Victoria and Albert Museum in London and the Louvre related restoration centre, Centre de Recherche et de Restauration des Musées de France) are involved in the project. They will be joining forces with NCR, a leading player in database and Data Warehouse technology; Interactive Labs, the new media design and development facility of Italy's leading art publishing group, Giunti; IT Innovation, Web-based system developers; and the Department of Electronics and Computer Science at the University of Southampton. Together they will create web based applications and tools for the automatic indexing and retrieval of high-resolution art images by pictorial content and information. <P3> The areas of innovation in this project are as follows: Using image content analysis to automatically extract metadata based on iconography, painting style etc; Use of high quality images (with data from several spectral bands and shadow data) for image content analysis of art; Use of distributed metadata using RDF to build on existing standards; Content-based navigation for art documents separating links from content and applying links according to context at presentation time; Distributed linking and searching across multiple archives allowing ownership of data to be retained; Storage of art images using large (>1TeraByte) multimedia object relational databases. <P4> The ARTISTE approach will use the power of object-related databases and content-retrieval to enable indexing to be made dynamically, by non-experts. 
<P5> In other words ARTISTE would aim to give searchers tools which hint at links due to say colour or brush-stroke texture rather than saying "this is the automatically classified data". <P6> The ARTISTE project will build on and exploit the indexing scheme proposed by the AQUARELLE consortia. The ARTISTE project solution will have a core component that is compatible with existing standards such as Z39.50. The solution will make use of emerging technical standards XML, RDF and X-Link to extend existing library standards to a more dynamic and flexible metadata system. The ARTISTE project will actively track and make use of existing terminology resources such as the Getty "Art and Architecture Thesaurus" (AAT) and the "Union List of Artist Names" (ULAN). <P7> Metadata will also be stored in a database. This may be stored in the same object-relational database, or in a separate database, according to the incumbent systems at the user partners. <P8> RDF provides for metadata definition through the use of schemas. Schemas define the relevant metadata terms (the namespace) and the associated semantics. Individual RDF queries and statements may use multiple schemas. The system will make use of existing schemas such as the Dublin Core schema and will provide wrappers for existing resources such as the Art and Architecture thesaurus in an RDF schema wrapper. <P9> The Distributed Query and Metadata Layer will also provide facilities to enable queries to be directed towards multiple distributed databases. The end user will be able to seamlessly search the combined art collection. This layer will adhere to worldwide digital library standards such as Z39.50, augmenting and extending as necessary to allow the richness of metadata enabled by the RDF standard.
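The multi-schema approach described in <P8>, where Dublin Core terms and a project-specific namespace coexist in one RDF description, can be sketched with Python's standard library. The RDF and Dublin Core namespace URIs below are real; the `artiste` namespace and `dominantColour` property are invented here purely for illustration.

```python
import xml.etree.ElementTree as ET

# RDF and Dublin Core namespace URIs are real; the ARTISTE one is invented.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ARTISTE = "http://example.org/artiste-schema#"  # hypothetical namespace

for prefix, uri in (("rdf", RDF), ("dc", DC), ("artiste", ARTISTE)):
    ET.register_namespace(prefix, uri)

def describe_image(image_uri, title, creator, dominant_colour):
    """Build one rdf:Description mixing Dublin Core and a local schema."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description",
                         {f"{{{RDF}}}about": image_uri})
    ET.SubElement(desc, f"{{{DC}}}title").text = title
    ET.SubElement(desc, f"{{{DC}}}creator").text = creator
    # Content-analysis output lives under the local (hypothetical) schema
    ET.SubElement(desc, f"{{{ARTISTE}}}dominantColour").text = dominant_colour
    return ET.tostring(root, encoding="unicode")

rdf_xml = describe_image("http://example.org/images/42",
                         "Portrait of a Lady", "Unknown", "ochre")
print(rdf_xml)
```

A description built this way can be queried against either vocabulary independently, which is the point of layering multiple schemas over one record.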
Conclusions
RQ "In conclusion the Artiste project will result into an interesting and innovative system for the art analysis, indexing storage and navigation. The actual state of the art of content-based retrieval systems will be positively influenced by the development of the Artiste project, which will pursue the following goals: A solution which can be replicated to European galleries, museums, etc.; Deep-content analysis software based on object relational database technology.; Distributed links server software, user interfaces, and content-based navigation software.; A fully integrated prototype analysis environment.; Recommendations for the exploitation of the project solution by European museums and galleries. ; Recommendations for the exploitation of the technology in other sectors.; "Impact on standards" report detailing augmentations of Z39.50 with RDF." ... "Not much research has been carried out worldwide on new algorithms for style-matching in art. This is probably not a major aim in Artiste but could be a spin-off if the algorithms made for specific author search requirements happen to provide data which can be combined with other data to help classify styles."
SOW
DC "Four major European galleries (The Uffizi in Florence, The National Gallery and the Victoria and Albert Museum in London and the Louvre related restoration centre, Centre de Recherche et de Restauration des Musées de France) are involved in the project. They will be joining forces with NCR, a leading player in database and Data Warehouse technology; Interactive Labs, the new media design and development facility of Italy's leading art publishing group, Giunti; IT Innovation, Web-based system developers; and the Department of Electronics and Computer Science at the University of Southampton. Together they will create web based applications and tools for the automatic indexing and retrieval of high-resolution art images by pictorial content and information."
Type
Electronic Journal
Title
Keeping Memory Alive: Practices for Preserving Digital Content at the National Digital Library Program of the Library of Congress
CA An overview of the major issues and initiatives in digital preservation at the Library of Congress. "In the medium term, the National Digital Library Program is focusing on two operational approaches. First, steps are taken during conversion that are likely to make migration or emulation less costly when they are needed. Second, the bit streams generated by the conversion process are kept alive through replication and routine refreshing supported by integrity checks. The practices described here provide examples of how those steps are implemented to keep the content of American Memory alive."
Phrases
<P1> The practices described here should not be seen as policies of the Library of Congress; nor are they suggested as best practices in any absolute sense. NDLP regards them as appropriate practices based on real experience, the nature and content of the originals, the primary purposes of the digitization, the state of technology, the availability of resources, the scale of the American Memory digital collection, and the goals of the program. They cover not just the storage of content and associated metadata, but also aspects of initial capture and quality review that support the long-term retention of content digitized from analog sources. <P2> The Library recognizes that digital information resources, whether born digital or converted from analog forms, should be acquired, used, and served alongside traditional resources in the same format or subject area. Such responsibility will include ensuring that effective access is maintained to the digital content through American Memory and via the Library's main catalog and, in coordination with the units responsible for the technical infrastructure, planning migration to new technology when needed. <P3> Refreshing can be carried out in a largely automated fashion on an ongoing basis. Migration, however, will require substantial resources, in a combination of processing time, out-sourced contracts, and staff time. Choice of appropriate formats for digital masters will defer the need for large-scale migration. Integrity checks and appropriate capture of metadata during the initial capture and production process will reduce the resource requirements for future migration steps. <warrant> We can be certain that migration of content to new data formats will be necessary at some point. The future will see industrywide adoption of new data formats with functional advantages over current standards. 
However, it will be difficult to predict exactly which metadata will be useful to support migration, when migration of master formats will be needed, and the nature and extent of resource needs. Human experts will need to decide when to undertake migration and develop tools for each migration step. <P4> Effective preservation of resources in digital form requires (a) attention early in the life-cycle, at the moment of creation, publication, or acquisition and (b) ongoing management (with attendant costs) to ensure continuing usability. <P5> The National Digital Library Program has identified several categories of metadata needed to support access and management for digital content. Descriptive metadata supports discovery through search and browse functions. Structural metadata supports presentation of complex objects by representing relationships between components, such as sequences of images. In addition, administrative metadata is needed to support management tasks, such as access control, archiving, and migration. Individual metadata elements may support more than one function, but the categorization of elements by function has proved useful. <P6> It has been recognized that metadata representations appropriate for manipulation and long-term retention may not always be appropriate for real-time delivery. <P7> It has also been realized that some basic descriptive metadata (at the very least a title or brief description) should be associated with the structural and administrative metadata. <P8> During 1999, an internal working group reviewed past experience and prototype exercises and compiled a core set of metadata elements that will serve the different functions identified. This set will be tested and refined as part of pilot activities during 2000. <P9> Master formats are well documented and widely deployed, preferably formal standards and preferably non-proprietary. 
Such choices should minimize the need for future migration or ensure that appropriate and affordable tools for migration will be developed by the industry. <warrant>
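The replication-and-refresh practice described in <P3>, where bit streams are kept alive through routine refreshing supported by integrity checks, can be sketched as follows. SHA-256 is assumed here as the fixity algorithm; the article does not name one.

```python
import hashlib

def fixity(bit_stream: bytes) -> str:
    """Record a fixity value (SHA-256 assumed) when the master is created."""
    return hashlib.sha256(bit_stream).hexdigest()

def refresh(bit_stream: bytes, recorded_fixity: str) -> bytes:
    """Copy the bit stream to new media only if it still matches its fixity
    value, so a damaged copy is never propagated forward."""
    if fixity(bit_stream) != recorded_fixity:
        raise ValueError("integrity check failed; restore from a replica")
    return bytes(bit_stream)  # the 'new' copy on fresh media

master = b"TIFF master image bytes..."
recorded = fixity(master)   # captured at conversion time
copy = refresh(master, recorded)
```

This is the largely automated, ongoing part of the practice; migration to new formats, as the text notes, still needs human judgment and purpose-built tools.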
Conclusions
RQ "Developing long-term strategies for preserving digital resources presents challenges associated with the uncertainties of technological change. There is currently little experience on which to base predictions of how often migration to new formats will be necessary or desirable or whether emulation will prove cost-effective for certain categories of resources. ... Technological advances, while sure to present new challenges, will also provide new solutions for preserving digital content."
Type
Electronic Journal
Title
Primary Sources, Research, and the Internet: The Digital Scriptorium at Duke
First Monday, Peer Reviewed Journal on the Internet
Publication Year
1997
Volume
2
Issue
9
Critical Arguments
CA "As the digital revolution moves us ever closer to the idea of the 'virtual library,' repositories of primary sources and other archival materials have both a special opportunity and responsibility. Since the materials in their custody are, by definition, often unique, these institutions will need to work very carefully with scholars and other researchers to determine what is the most effective way of making this material accessible in a digital environment."
Phrases
<P1> The matter of Internet access to research materials and collections is not one of simply doing what we have always done -- except digitally. It represents instead an opportunity to rethink the fundamental triangular relationship between libraries and archives, their collections, and their users. <P2> Digital information as it exists on the Internet today requires more navigational, contextual, and descriptive data than is currently provided in traditional card catalogs or their more modern electronic equivalent. One simply cannot throw up vast amounts of textual or image-based data onto the World Wide Web and expect existing search engines to make much sense of it or users to be able to digest the results. ... Archivists and manuscript curators have for many years now been providing just that sort of contextual detail in the guides, finding aids, and indexes that they have traditionally prepared for their holdings. <P3> Those involved in the Berkeley project understood that HTML was essentially a presentational encoding scheme and lacked the formal structural and content-based encoding that SGML would offer. <P4> Encoded Archival Description is quickly moving towards becoming an internationally embraced standard for the encoding of archival metadata in a wide variety of archival repositories and special collections libraries. And the Digital Scriptorium at Duke has become one of the early implementors of this standard. <warrant>
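The contrast <P3> draws between presentational HTML and structural encoding can be illustrated with a minimal EAD-style fragment. The element names below (ead, eadheader, archdesc, did, unittitle, unitdate) come from the EAD tag set, but the finding-aid content itself is invented.

```python
import xml.etree.ElementTree as ET

# An invented, minimal finding-aid fragment using EAD element names.
ead_fragment = """
<ead>
  <eadheader><eadid>us-xx-0001</eadid></eadheader>
  <archdesc level="collection">
    <did>
      <unittitle>Jane Doe Papers</unittitle>
      <unitdate>1890-1925</unitdate>
    </did>
  </archdesc>
</ead>
"""

root = ET.fromstring(ead_fragment)
# Unlike presentational HTML, the markup states *what* each string is,
# so software can query titles and dates by element name.
title = root.findtext(".//unittitle")
date = root.findtext(".//unitdate")
```

Because "Jane Doe Papers" is tagged as a unit title rather than, say, a bold heading, a cross-repository search engine can retrieve it at a granular level, which is exactly what the American Heritage project described in the Conclusions relies on.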
Conclusions
RQ "Duke is currently involved in a project that is funded through NEH and also involves the libraries of Stanford, the University of Virginia, and the University of California-Berkeley. This project (dubbed the "American Heritage Virtual Digital Archives Project") will create a virtual archive of encoded finding aids from all four institutions. This archive will permit seamless searching of these finding aids -- at a highly granular level of detail -- through a single search engine on one site and will, it is hoped, provide a model for a more comprehensive national system in the near future."
Type
Electronic Journal
Title
Collection-Based Persistent Digital Archives - Part 1
The preservation of digital information for long periods of time is becoming feasible through the integration of archival storage technology from supercomputer centers, data grid technology from the computer science community, information models from the digital library community, and preservation models from the archivist's community. The supercomputer centers provide the technology needed to store the immense amounts of digital data that are being created, while the digital library community provides the mechanisms to define the context needed to interpret the data. The coordination of these technologies with preservation and management policies defines the infrastructure for a collection-based persistent archive. This paper defines an approach for maintaining digital data for hundreds of years through development of an environment that supports migration of collections onto new software systems.
ISBN
1082-9873
Critical Arguments
CA "Supercomputer centers, digital libraries, and archival storage communities have common persistent archival storage requirements. Each of these communities is building software infrastructure to organize and store large collections of data. An emerging common requirement is the ability to maintain data collections for long periods of time. The challenge is to maintain the ability to discover, access, and display digital objects that are stored within an archive, while the technology used to manage the archive evolves. We have implemented an approach based upon the storage of the digital objects that comprise the collection, augmented with the meta-data attributes needed to dynamically recreate the data collection. This approach builds upon the technology needed to support extensible database schema, which in turn enables the creation of data handling systems that interconnect legacy storage systems."
Phrases
<P1> The ultimate goal is to preserve not only the bits associated with the original data, but also the context that permits the data to be interpreted. <warrant> <P2> We rely on the use of collections to define the context to associate with digital data. The context is defined through the creation of semi-structured representations for both the digital objects and the associated data collection. <P3> A collection-based persistent archive is therefore one in which the organization of the collection is archived simultaneously with the digital objects that comprise the collection. <P4> The goal is to preserve digital information for at least 400 years. This paper examines the technical issues that must be addressed and presents a prototype implementation. <P5> Digital object representation. Every digital object has attributes that define its structure, physical context, and provenance, and annotations that describe features of interest within the object. Since the set of attributes (such as annotations) will vary across all objects within a collection, a semi-structured representation is needed. Not all digital objects will have the same set of associated attributes. <P6> If possible, a common information model should be used to reference the attributes associated with the digital objects, the collection organization, and the presentation interface. An emerging standard for a uniform data exchange model is the eXtensible Markup Language (XML). <P7> A particular example of an information model is the XML Document Type Definition (DTD) which provides a description for the allowed nesting structure of XML elements. Richer information models are emerging such as XSchema (which provides data types, inheritance, and more powerful linking mechanisms) and XMI (which provides models for multiple levels of data abstraction).
<P8> Although XML DTDs were originally applied to documents only, they are now being applied to arbitrary digital objects, including the collections themselves. More generally, OSDs can be used to define the structure of digital objects, specify inheritance properties of digital objects, and define the collection organization and user interface structure. <P9> A persistent collection therefore needs the following components of an OSD to completely define the collection context: Data dictionary for collection semantics; Digital object structure; Collection structure; and User interface structure. <P10> The re-creation or instantiation of the data collection is done with a software program that uses the schema descriptions that define the digital object and collection structure to generate the collection. The goal is to build a generic program that works with any schema description. <P11> The information for which driver to use for access to a particular data set is maintained in the associated Meta-data Catalog (MCAT). The MCAT system is a database containing information about each data set that is stored in the data storage systems. <P12> The data handling infrastructure developed at SDSC has two components: the SDSC Storage Resource Broker (SRB) that provides federation and access to distributed and diverse storage resources in a heterogeneous computing environment, and the Meta-data Catalog (MCAT) that holds systemic and application or domain-dependent meta-data about the resources and data sets (and users) that are being brokered by the SRB. <P13> A client does not need to remember the physical mapping of a data set. It is stored as meta-data associated with the data set in the MCAT catalog. <P14> A characterization of a relational database requires a description of both the logical organization of attributes (the schema), and a description of the physical organization of attributes into tables. 
For the persistent archive prototype we used XML DTDs to describe the logical organization. <P15> A combination of the schema and physical organization can be used to define how queries can be decomposed across the multiple tables that are used to hold the meta-data attributes. <P16> By using an XML-based database, it is possible to avoid the need to map between semi-structured and relational organizations of the database attributes. This minimizes the amount of information needed to characterize a collection, and makes the re-creation of the database easier. <warrant> <P17> Digital object attributes are separated into two classes of information within the MCAT: System-level meta-data that provides operational information. These include information about resources (e.g., archival systems, database systems, etc., and their capabilities, protocols, etc.) and data objects (e.g., their formats or types, replication information, location, collection information, etc.); Application-dependent meta-data that provides information specific to particular data sets and their collections (e.g., Dublin Core values for text objects). <P18> Internally, MCAT keeps schema-level meta-data about all of the attributes that are defined. The schema-level attributes are used to define the context for a collection and enable the instantiation of the collection on new technology. <P19> The logical structure should not be confused with database schema and are more general than that. For example, we have implemented the Dublin Core database schema to organize attributes about digitized text. The attributes defined in the logical structure that is associated with the Dublin Core schema contains information about the subject, constraints, and presentation formats that are needed to display the schema along with information about its use and ownership. 
<P20> The MCAT system supports the publication of schemata associated with data collections, schema extension through the addition or deletion of new attributes, and the dynamic generation of the SQL that corresponds to joins across combinations of attributes. <P21> By adding routines to access the schema-level meta-data from an archive, it is possible to build a collection-based persistent archive. As technology evolves and the software infrastructure is replaced, the MCAT system can support the migration of the collection to the new technology.
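The MCAT division of labour in <P11>, <P13> and <P17> (clients name data sets; the catalog resolves physical locations and holds both system-level and application-level metadata) can be sketched with a toy in-memory catalog. All names, paths, and driver identifiers below are invented for illustration.

```python
# A toy stand-in for the MCAT idea: the client names a data set; the catalog
# supplies system-level metadata (where and how it is stored) and
# application-level metadata (e.g. Dublin Core values). Entries are invented.
catalog = {
    "letter-0001": {
        "system": {"resource": "hpss-archive", "driver": "hpss",
                   "physical_path": "/archive/a/letter-0001.xml"},
        "application": {"dc:title": "Letter to the editor",
                        "dc:creator": "J. Smith"},
    },
}

def resolve(dataset: str):
    """Return driver and location; the client never stores the mapping."""
    entry = catalog[dataset]
    return entry["system"]["driver"], entry["system"]["physical_path"]

def find_by_attribute(attr: str, value: str):
    """Discovery over application-level (schema-defined) attributes."""
    return [name for name, e in catalog.items()
            if e["application"].get(attr) == value]
```

Keeping the two metadata classes separate is what lets the collection migrate: the application attributes travel with the collection while the system attributes are rewritten for whatever storage technology comes next.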
Conclusions
RQ Collection-Based Persistent Digital Archives - Part 2
SOW
DC "The technology proposed by SDSC for implementing persistent archives builds upon interactions with many of these groups. Explicit interactions include collaborations with Federal planning groups, the Computational Grid, the digital library community, and individual federal agencies." ... "The data management technology has been developed through multiple federally sponsored projects, including the DARPA project F19628-95-C-0194 "Massive Data Analysis Systems," the DARPA/USPTO project F19628-96-C-0020 "Distributed Object Computation Testbed," the Data Intensive Computing thrust area of the NSF project ASC 96-19020 "National Partnership for Advanced Computational Infrastructure," the NASA Information Power Grid project, and the DOE ASCI/ASAP project "Data Visualization Corridor." Additional projects related to the NSF Digital Library Initiative Phase II and the California Digital Library at the University of California will also support the development of information management technology. This work was supported by a NARA extension to the DARPA/USPTO Distributed Object Computation Testbed, project F19628-96-C-0020."
Type
Electronic Journal
Title
Collection-Based Persistent Digital Archives - Part 2
"Collection-Based Persistent Digital Archives: Part 2" describes the creation of a one million message persistent E-mail collection. It discusses the four major components of a persistent archive system: support for ingestion, archival storage, information discovery, and presentation of the collection. The technology to support each of these processes is still rapidly evolving, and opportunities for further research are identified.
ISBN
1082-9873
Critical Arguments
CA "The multiple migration steps can be broadly classified into a definition phase and a loading phase. The definition phase is infrastructure independent, whereas the loading phase is geared towards materializing the processes needed for migrating the objects onto new technology. We illustrate these steps by providing a detailed description of the actual process used to ingest and load a million-record E-mail collection at the San Diego Supercomputer Center (SDSC). Note that the SDSC processes were written to use the available object-relational databases for organizing the meta-data. In the future, it may be possible to go directly to XML-based databases."
Phrases
<P1> The processes used to ingest a collection, transform it into an infrastructure independent form, and store the collection in an archive comprise the persistent storage steps of a persistent archive. The processes used to recreate the collection on new technology, optimize the database, and recreate the user interface comprise the retrieval steps of a persistent archive. <P2> In order to build a persistent collection, we consider a solution that "abstracts" all aspects of the data and its preservation. In this approach, data object and processes are codified by raising them above the machine/software dependent forms to an abstract format that can be used to recreate the object and the processes in any new desirable forms. <P3> The SDSC infrastructure uses object-relational databases to organize information. This makes data ingestion more complex by requiring the mapping of the XML DTD semi-structured representation onto a relational schema. <P4> The steps used to store the persistent archive were: (1) Define Digital Object: define meta-data, define object structure (OBJ-DTD) --- (A), define object DTD to object DDL mapping --- (B) (2) Define Collection: define meta-data, define collection structure (COLL-DTD) --- (C), define collection DTD structure to collection DDL mapping --- (D) (3) Define Containers: define packing format for encapsulating data and meta-data (examples are the AIP standard, Hierarchical Data Format, Document Type Definition) <P5> In the ingestion phase, the relational and semi-structured organization of the meta-data is defined. No database is actually created, only the mapping between the relational organization and the object DTD.
<P6> Note that the collection relational organization does not have to encompass all of the attributes that are associated with a digital object. Separate information models are used to describe the objects and the collections. It is possible to take the same set of digital objects and form a new collection with a new relational organization. <P7> Multiple communities across academia, the federal government, and standards groups are exploring strategies for managing very large archives. The persistent archive community needs to maintain interactions with these communities to track development of new strategies for data management and storage. <warrant>
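The definition/loading split described above can be sketched for a single e-mail message using Python's stdlib sqlite3: the element-to-column mapping is the infrastructure-independent definition, and the DDL generation plus insert is the loading phase. The element names, column names, and table name are illustrative, not SDSC's actual schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Definition phase (infrastructure independent): a mapping from the object's
# semi-structured elements to relational columns. Names are invented.
MAPPING = {"from": "sender", "to": "recipient", "subject": "subject",
           "body": "body"}

def ddl(table: str) -> str:
    """Derive the relational DDL from the mapping, not by hand."""
    cols = ", ".join(f"{col} TEXT" for col in MAPPING.values())
    return f"CREATE TABLE {table} ({cols})"

def load(conn, table: str, message_xml: str) -> None:
    """Loading phase: materialise one e-mail into the relational store."""
    msg = ET.fromstring(message_xml)
    row = tuple(msg.findtext(tag, default="") for tag in MAPPING)
    placeholders = ", ".join("?" for _ in MAPPING)
    conn.execute(f"INSERT INTO {table} VALUES ({placeholders})", row)

conn = sqlite3.connect(":memory:")
conn.execute(ddl("email"))
load(conn, "email",
     "<message><from>a@x</from><to>b@y</to>"
     "<subject>hello</subject><body>hi</body></message>")
```

Because only the mapping is archived, the same definition can regenerate the DDL and reload the messages on whatever database technology exists at migration time.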
Conclusions
RQ "The four major components of the persistent archive system are support for ingestion, archival storage, information discovery, and presentation of the collection. The first two components focus on the ingestion of data into collections. The last two focus on access to the resulting collections. The technology to support each of these processes is still rapidly evolving. Hence consensus on standards has not been reached for many of the infrastructure components. At the same time, many of the components are active areas of research. To reach consensus on a feasible collection-based persistent archive, continued research and development is needed. Examples of the many related issues are listed below:
Type
Report
Title
Mapping of the Encoded Archival Description DTD Element Set to the CIDOC CRM
The CIDOC CRM is the first ontology designed to mediate contents in the area of material cultural heritage and beyond, and has been accepted by ISO TC46 as work item for an international standard. The EAD Document Type Definition (DTD) is a standard for encoding archival finding aids using the Standard Generalized Markup Language (SGML). Archival finding aids are detailed guides to primary source material which provide fuller information than that normally contained within cataloging records. 
Publisher
Institute of Computer Science, Foundation for Research and Technology - Hellas
Publication Location
Heraklion, Crete, Greece
Language
English
Critical Arguments
CA "This report describes the semantic mapping of the current EAD DTD Version 1.0 Element Set to the CIDOC CRM and its latest extension. This work represents a proof of concept for the functionality the CIDOC CRM is designed for." 
Conclusions
RQ "Actually, the CRM seems to do the job quite well – problems in the mapping arise more from underspecification in the EAD rather than from too domain-specific notions." ... "To our opinion, the archival community could benefit from the conceptualizations of the CRM to motivate more powerful metadata standards with wide interoperability in the future, to the benefit of museums and other disciplines as well."
SOW
DC "As a potential international standard, the EAD DTD is maintained in the Network Development and MARC Standards Office of the Library of Congress in partnership with the Society of American Archivists." ... "The CIDOC Conceptual Reference Model (see [CRM1999], [Doerr99]), in the following only referred to as «CRM», is outcome of an effort of the Documentation Standards Group of the CIDOC Committee (see «http:/www.cidoc.icom.org», "http://cidoc.ics.forth.gr") of ICOM, the International Council of Museums beginning in 1996."
Type
Report
Title
Management of Electronic Records PROS 99/007 (Version 2)
This document is the Victorian Electronic Records Strategy (VERS) Standard (PROS 99/007); it is primarily concerned with conformance. The technical requirements of the Standard are contained in five Specifications.
Accessed Date
August 24, 2005
Critical Arguments
CA VERS has two major goals: the preservation of electronic records and enabling efficient management in doing so. Version 2 has an improved structure, additional metadata elements, requirements for preservation and compliance requirements for agencies. "Export" compliance allows agencies to maintain their records within their own recordkeeping systems and add a module so they can generate the VERS format for export, especially for long-term preservation. "Native" compliance means records are converted to the long-term preservation format upon registration, which is seen as the ideal approach. ... "The Victorian Electronic Records Strategy (VERS) is designed to assist agencies in managing their electronic records. The strategy focuses on the data or information contained in electronic records, rather than the systems that are used to produce them."
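"Export" compliance, generating a preservation format from records held in an existing system, might look like the following sketch: the record and its metadata are wrapped in a self-describing XML envelope with a fixity value. The envelope element names are invented here and do not follow the actual VERS VEO specification.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

def export_record(content: bytes, metadata: dict) -> str:
    """Wrap a record in a self-describing envelope for long-term keeping.
    Element names are invented; VERS defines its own VEO structure."""
    env = ET.Element("EncapsulatedRecord")
    meta = ET.SubElement(env, "Metadata")
    for key, value in metadata.items():
        ET.SubElement(meta, key).text = value
    doc = ET.SubElement(env, "Content", encoding="base64")
    doc.text = base64.b64encode(content).decode("ascii")
    ET.SubElement(env, "Fixity", algorithm="sha256").text = \
        hashlib.sha256(content).hexdigest()
    return ET.tostring(env, encoding="unicode")

veo = export_record(b"%PDF-1.4 ...", {"Title": "Annual report",
                                      "Agency": "Example Office"})
```

An export module of this shape can sit beside an agency's existing recordkeeping system, which is why the Standard treats export compliance as a lower-cost path than native compliance.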
SOW
DC "VERS was developed with the assistance of CSIRO, Ernst & Young, the Department of Infrastructure, and records managers across government. The recommendations included in the VERS Final Report1 issued in March 1999 provide a framework for the management of electronic records." ... "Public Records Office Victoria is the Archives of the State of Victoria. They hold the records from the beginnings of the colonial administration of Victoria in the mid-1830s to today."
Type
Web Page
Title
Archiving The Avant Garde: Documenting And Preserving Variable Media Art.
Archiving the Avant Garde is a collaborative project to develop, document, and disseminate strategies for describing and preserving non-traditional, intermedia, and variable media art forms, such as performance, installation, conceptual, and digital art. This joint project builds on existing relationships and the previous work of its founding partners in this area. One example of such work is the Conceptual & Intermedia Arts Online (CIAO) Consortium, a collaboration founded by the BAM/PFA, the Walker Art Center, and Franklin Furnace, that includes 12 other international museums and arts organizations. CIAO develops standardized methods of documenting and providing access to conceptual and other ephemeral intermedia art forms. Another example of related work conducted by the project's partners is the Variable Media Initiative, organized by the Guggenheim Museum, which encourages artists to define their work independently from medium so that the work can be translated once its current medium is obsolete. Archiving the Avant Garde will take the ideas developed in previous efforts and develop them into community-wide working strategies by testing them on specific works of art in the practical working environments of museums and arts organizations. The final project report will outline a comprehensive strategy and model for documenting and preserving variable media works, based on case studies to illustrate practical examples, but always emphasizing the generalized strategy behind the rule. This report will be informed by specific and practical institutional practice, but we believe that the ultimate model developed by the project should be based on international standards independent of any one organization's practice, thus making it adaptable to many organizations. Dissemination of the report, discussed in detail below, will be ongoing and widespread.
Critical Arguments
CA "Works of variable media art, such as performance, installation, conceptual, and digital art, represent some of the most compelling and significant artistic creation of our time. These works are key to understanding contemporary art practice and scholarship, but because of their ephemeral, technical, multimedia, or otherwise variable natures, they also present significant obstacles to accurate documentation, access, and preservation. The works were in many cases created to challenge traditional methods of art description and preservation, but now, lacking such description, they often comprise the more obscure aspects of institutional collections, virtually inaccessible to present day researchers. Without strategies for cataloging and preservation, many of these vital works will eventually be lost to art history. Description of and access to art collections promote new scholarship and artistic production. By developing ways to catalog and preserve these collections, we will both provide current and future generations the opportunity to learn from and be inspired by the works and ensure the perpetuation and accuracy of art historical records. It is to achieve these goals that we are initiating the consortium project Archiving the Avant Garde: Documenting and Preserving Variable Media Art."
Conclusions
RQ "Archiving the Avant Garde will take a practical approach to solving problems in order to ensure the feasibility and success of the project. This project will focus on key issues previously identified by the partners and will leave other parts of the puzzle to be solved by other initiatives and projects in regular communication with this group. For instance, this project realizes that the arts community will need to develop software tools which enable collections care professionals to implement the necessary new description and metadata standards, but does not attempt to develop such tools in the context of this project. Rather, such tools are already being developed by a separate project under MOAC. Archiving the Avant Garde will share information with that project and benefit from that work. Similarly, the prospect of developing full-fledged software emulators is one best solved by a team of computer scientists, who will work closely with members of the proposed project to cross-fertilize methods and share results. Importantly, while this project is focused on immediate goals, the overall collaboration between the partner organizations and their various initiatives will be significant in bringing together the computer science, arts, standards, and museum communities in an open-source project model to maximize collective efforts and see that the benefits extend far and wide."
SOW
DC "We propose a collaborative project that will begin to establish such professional best practice. The collaboration, consisting of the Berkeley Art Museum and Pacific Film Archive (BAM/PFA), the Solomon R. Guggenheim Museum, Rhizome.org, the Franklin Furnace Archive, and the Cleveland Performance Art Festival and Archive, will have national impact due to the urgent and universal nature of the problem for contemporary art institutions, the practicality and adaptability of the model developed by this group, and the significant expertise that this nationwide consortium will bring to bear in the area of documenting and preserving variable media art." ... "We believe that a model informed by and tested in such diverse settings, with broad public and professional input (described below), will be highly adaptable." ..."Partners also represent a geographic and national spread, from East Coast to Midwest to West Coast. This coverage ensures that a wide segment of the professional community and public will have opportunities to participate in public forums, hosted at partner institutions during the course of the project, intended to gather an even broader cross-section of ideas and feedback than is represented by the partners." ... "The management plan for this project will be highly decentralized ensuring that no one person or institution will unduly influence the model strategy for preserving variable media art and thereby reduce its adaptability."
CA Discusses the challenges faced by librarians and archivists who must determine which, and how much, of the vast amount of digitally recorded sound material to preserve. Identifies various types of digital sound formats and the varying standards to which they are created. Specific challenges discussed include copyright issues; technologies and platforms; digitization and preservation; and metadata and other standards.
Conclusions
RQ "Whether between record companies and archives or with others, some type of collaborative approach to audio preservation will be necessary if significant numbers of audio recordings at risk are to be preserved for posterity. ... One particular risk of preservation programs now is redundancy. ... Inadequate cataloging is a serious impediment to preservation efforts. ... It would be useful to archives, and possibly to intellectual property holders as well, if archives could use existing industry data for the bibliographic control of published recordings and detailed listings of the music recorded on each disc or tape. ... Greater collaboration between libraries and the sound recording industry could result in more comprehensive catalogs that document recording sessions with greater specificity. With access to detailed and authoritative information about the universe of published sound recordings, libraries could devote more resources to surveying their unpublished holdings and collaborate on the construction of a preservation registry to help reduce preservation redundancy. ... Many archivists believe that adequate funding for preservation will not be forthcoming unless and until the recordings preserved can be heard more easily by the public. ... If audio recordings that do not have mass appeal are to be preserved, that responsibility will probably fall to libraries and archives. Within a partnership between archives and intellectual property owners, archives might assume responsibility for preserving less commercial music in return for the ability to share files of preserved historical recordings."
Type
Web Page
Title
CDL Digital Object Standard: Metadata, Content and Encoding
This document addresses the standards for digital object collections for the California Digital Library. Adherence to these standards is required for all CDL contributors and may also serve University of California staff as guidelines for digital object creation and presentation. These standards are not intended to address all of the administrative, operational, and technical issues surrounding the creation of digital object collections.
Critical Arguments
CA These standards describe the file formats, storage and access standards for digital objects created by or incorporated into the CDL as part of the permanent collections. They attempt to balance adherence to industry standards, reproduction quality, access, potential longevity and cost.
Conclusions
RQ not applicable
SOW
DC "This is the first version of the CDL Digital Object Standard. This version is based upon the September 1, 1999 version of the CDL's Digital Image Standard, which included recommendations of the Museum Educational Site Licensing Project (MESL), the Library of Congress and the MOA II participants." ... "The Museum Educational Site Licensing Project (MESL) offered a framework for seven collecting institutions, primarily museums, and seven universities to experiment with new ways to distribute visual information--both images and related textual materials. " ... "The Making of America (MoA II) Testbed Project is a Digital Library Federation (DLF) coordinated, multi-phase endeavor to investigate important issues in the creation of an integrated, but distributed, digital library of archival materials (i.e., digitized surrogates of primary source materials found in archives and special collections). The participants include Cornell University, New York Public Library, Pennsylvania State University, Stanford University and UC Berkeley. The Library of Congress white papers and standards are based on the experience gained during the American Memory Pilot Project. The concepts discussed and the principles developed still guide the Library's digital conversion efforts, although they are under revision to accomodate the capabilities of new technologies and new digital formats." ... 
"The CDL Technical Architecture and Standards Workgroup includes the following members with extensive experience with digital object collection and management: Howard Besser, MESL and MOA II digital imaging testbed projects; Diane Bisom, University of California, Irvine; Bernie Hurley, MOA II, University of California, Berkeley; Greg Janee, Alexandria Digital Library; John Kunze, University of California, San Francisco; Reagan Moore and Chaitanya Baru, San Diego Supercomputer Center, ongoing research with the National Archives and Records Administration on the long term storage and retrieval of digital content; Terry Ryan, University of California, Los Angeles; David Walker, California Digital Library"
Type
Web Page
Title
Update on the National Digital Infrastructure Initiative
CA Describes progress on a five-year national strategy for preserving digital content.
Conclusions
RQ "These sessions helped us set priorities. Participants agreed about the need for a national preservation strategy. People from industry were receptive to the idea that the public good, as well as their own interests, would be served by coming together to think about long-term preservation. They also agreed on the need for some form of distributor-decentralized solution. Like others, they realize that no library can tackle the digital preservation challenge alone. Many parties will need to come together. Participants agreed about the need for digital preservation research, a clearer agenda, a better focus, and a greater appreciation that technology is not necessarily the prime focus. The big challenge might be organizational architecture, i.e., roles and responsibilities. Who is going to do what? How will we reach agreement?"
Type
Web Page
Title
PBCore: Public Broadcasting Metadata Dictionary Project
CA "PBCore is designed to provide -- for television, radio and Web activities -- a standard way of describing and using media (video, audio, text, images, rich interactive learning objects). It allows content to be more easily retrieved and shared among colleagues, software systems, institutions, community and production partners, private citizens, and educators. It can also be used as a guide for the onset of an archival or asset management process at an individual station or institution. ... The Public Broadcasting Metadata Dictionary (PBCore) is: a core set of terms and descriptors (elements) used to create information (metadata) that categorizes or describes media items (sometimes called assets or resources)."
Conclusions
RQ The PBCore Metadata Elements are currently in their first published edition, Version 1.0. Over two years of research and lively discussions have generated this version. ... As various users and communities begin to implement the PBCore, updates and refinements to the PBCore are likely to occur. Any changes will be clearly identified, ramifications outlined, and published to our constituents.
SOW
DC "Initial development funding for PBCore was provided by the Corporation for Public Broadcasting. The PBCore is built on the foundation of the Dublin Core (ISO 15836) ... and has been reviewed by the Dublin Core Metadata Initiative Usage Board. ... PBCore was successfully deployed in a number of test implementations in May 2004 in coordination with WGBH, Minnesota Public Radio, PBS, National Public Radio, Kentucky Educational Television, and recognized metadata expert Grace Agnew. As of July 2004 in response to consistent feedback to make metadata standards easy to use, the number of metadata elements was reduced to 48 from the original set of 58 developed by the Metadata Dictionary Team. Also, efforts are ongoing to provide more focused metadata examples that are specific to TV and radio. ... Available free of charge to public broadcasting stations, distributors, vendors, and partners, version 1.0 of PBCore was launched in the first quarter of 2005. See our Licensing Agreement via the Creative Commons for further information. ... Plans are under way to designate an Authority/Maintenance Organization."
The creation and use of metadata is likely to become an important part of all digital preservation strategies, whether they are based on hardware and software conservation, emulation or migration. The UK Cedars project aims to promote awareness of the importance of digital preservation, to produce strategic frameworks for digital collection management policies and to promote methods appropriate for long-term preservation - including the creation of appropriate metadata. Preservation metadata is a specialised form of administrative metadata that can be used as a means of storing the technical information that supports the preservation of digital objects. In addition, it can be used to record migration and emulation strategies, to help ensure authenticity, and to note rights management and collection management data; it will also need to interact with resource discovery metadata. The Cedars project is attempting to investigate some of these issues and will provide some demonstrator systems to test them.
Notes
This article was presented at the Joint RLG and NPO Preservation Conference: Guidelines for Digital Imaging, held September 28-30, 1998.
Critical Arguments
CA "Cedars is a project that aims to address strategic, methodological and practical issues relating to digital preservation (Day 1998a). A key outcome of the project will be to improve awareness of digital preservation issues, especially within the UK higher education sector. Attempts will be made to identify and disseminate: Strategies for collection management ; Strategies for long-term preservation. These strategies will need to be appropriate to a variety of resources in library collections. The project will also include the development of demonstrators to test the technical and organisational feasibility of the chosen preservation strategies. One strand of this work relates to the identification of preservation metadata and a metadata implementation that can be tested in the demonstrators." ... "The Cedars Access Issues Working Group has produced a preliminary study of preservation metadata and the issues that surround it (Day 1998b). This study describes some digital preservation initiatives and models with relation to the Cedars project and will be used as a basis for the development of a preservation metadata implementation in the project. The remainder of this paper will describe some of the metadata approaches found in these initiatives."
Conclusions
RQ "The Cedars project is interested in helping to develop suitable collection management policies for research libraries." ... "The definition and implementation of preservation metadata systems is going to be an important part of the work of custodial organisations in the digital environment."
SOW
DC "The Cedars (CURL exemplars in digital archives) project is funded by the Joint Information Systems Committee (JISC) of the UK higher education funding councils under Phase III of its Electronic Libraries (eLib) Programme. The project is administered through the Consortium of University Research Libraries (CURL) with lead sites based at the Universities of Cambridge, Leeds and Oxford."
Type
Web Page
Title
Metadata for preservation : CEDARS project document AIW01
This report is a review of metadata formats and initiatives in the specific area of digital preservation. It supplements the DESIRE Review of metadata (Dempsey et al. 1997). It is based on a literature review and information picked-up at a number of workshops and meetings and is an attempt to briefly describe the state of the art in the area of metadata for digital preservation.
Critical Arguments
CA "The projects, initiatives and formats reviewed in this report show that much work remains to be done. . . . The adoption of persistent and unique identifiers is vital, both in the CEDARS project and outside. Many of these initiatives mention "wrappers", "containers" and "frameworks". Some thought should be given to how metadata should be integrated with data content in CEDARS. Authenticity (or intellectual preservation) is going to be important. It will be interesting to investigate whether some archivists' concerns with custody or "distributed custody" will have relevance to CEDARS."
Conclusions
RQ Which standards and initiatives described in this document have proved viable preservation metadata models?
SOW
DC OAIS emerged out of an initiative spearheaded by NASA's Consultative Committee for Space Data Systems. It has been shaped and promoted by the RLG and OCLC. Several international projects have played key roles in shaping the OAIS model and adapting it for use in libraries, archives and research repositories. OAIS-modeled repositories include the CEDARS Project, Harvard's Digital Repository, Koninklijke Bibliotheek (KB), the Library of Congress' Archival Information Package for audiovisual materials, MIT's D-Space, OCLC's Digital Archive and TERM: the Texas Email Repository Model.
Type
Web Page
Title
Approaches towards the Long Term Preservation of Archival Digital Records
The Digital Preservation Testbed is carrying out experiments according to pre-defined research questions to establish the best preservation approach or combination of approaches. The Testbed will be focusing its attention on three different digital preservation approaches - Migration; Emulation; and XML - evaluating the effectiveness of these approaches, their limitations, costs, risks, uses, and resource requirements.
Language
English; Dutch
Critical Arguments
CA "The main problem surrounding the preservation of authentic electronic records is that of technology obsolescence. As changes in technology continue to increase exponentially, the problem arises of what to do with records that were created using old and now obsolete hardware and software. Unless action is taken now, there is no guarantee that the current computing environment (and thus also records) will be accessible and readable by future computing environments."
Conclusions
RQ "The Testbed will be conducting research to discover if there is an inviolable way to associate metadata with records and to assess the limitations such an approach may incur. We are also working on the provision of a proposed set of preservation metadata that will contain information about the preservation approach taken and any specific authenticity requirements."
SOW
DC The Digital Preservation Testbed is part of the non-profit organisation ICTU. ICTU is the Dutch organisation for ICT and government. ICTU's goal is to contribute to the structural development of e-government. This will result in improving the work processes of government organisations, their service to the community and interaction with the citizens. Government institutions, such as Ministries, design the policies in the area of e-government, and ICTU translates these policies into projects. In many cases, more than one institution is involved in a single project. They are the principals in the projects and retain control concerning the focus of the project. In case of the Digital Preservation Testbed the principals are the Ministry of the Interior and the Dutch National Archives.
This paper discusses how metadata standards can help organizations comply with the ISO 9000 standards for quality systems. It provides a brief overview of metadata, ISO 9000 and related records management standards. It then analyses in some depth the ISO 9000 requirements for quality records, and outlines the problems that some organizations have in complying with them. It also describes the metadata specifications developed by the University of Pittsburgh Electronic Recordkeeping project and the SPIRT Recordkeeping Metadata project in Australia and discusses the role of metadata in meeting ISO 9000 requirements for the creation and preservation of reliable, authentic and accessible records.
Publisher
Records Continuum Research Group
Critical Arguments
CA "During the last few years a number of research projects have studied the types of metadata needed to create, manage and make accessible quality records, i.e. reliable, authentic and useable records. This paper will briefly discuss the purposes of recordkeeping metadata, with reference to emerging records management standards, and the models presented by two projects, one in the United States and one in Australia. It will also briefly review the ISO 9000 requirements for records and illustrate how metadata can help an organization meet these requirements."
Conclusions
RQ "Quality records provide many advantages for organizations and can help companies meet the ISO 9000 certification. However, systems must be designed to create the appropriate metadata to ensure they comply with recordkeeping requirements, particularly those identified by records management standards like AS 4390 and the proposed international standard, which provide benchmarks for recordkeeping best practice. The Pittsburgh metadata model and the SPIRT framework provide organizations with standardized sets of metadata that would ensure the creation, preservation and accessibility of reliable, authentic and meaningful records for as long as they are of use. In deciding what metadata to capture, organisations should consider the cost of meeting the requirements of the ISO 9000 guidelines and any related records management best practice standards, and the possible risk of not meeting these requirements."
Type
Web Page
Title
Softening the borderlines of archives through XML - a case study
Archives have always had trouble getting metadata in formats they can process. With XML, these problems are lessening. Many applications today provide the option of exporting data into an application-defined XML format that can easily be post-processed using XSLT, schema mappers, etc., to fit the archives' needs. This paper highlights two practical examples of the use of XML in the Swiss Federal Archives and discusses the advantages and disadvantages of XML in these examples. The first use of XML is the import of existing metadata describing debates at the Swiss parliament, whereas the second concerns the preservation of metadata in the archiving of relational databases. We have found that the use of XML for metadata encoding is beneficial for the archives, especially for its ease of editing, built-in validation and ease of transformation.
Notes
The Swiss Federal Archives defines the norms and basis of records management and advises departments of the Federal Administration on their implementation. http://www.bar.admin.ch/bar/engine/ShowPage?pageName=ueberlieferung_aktenfuehrung.jsp
Critical Arguments
CA "This paper briefly discusses possible uses of XML in an archival context and the policies of the Swiss Federal Archives concerning this use (Section 2), provides a rough overview of the applications we have that use XML (Section 3) and the experiences we made (Section 4)."
Conclusions
RQ "The systems described above are now just being deployed into real world use, so the experiences presented here are drawn from the development process and preliminary testing. No hard facts in testing the sustainability of XML could be gathered, as the test is time itself. This test will be passed when we can still access the data stored today, including all metadata, in ten or twenty years." ... "The main problem area with our applications was the encoding of the XML documents and the non-standard XML document generation of some applications. When dealing with the different encodings (UTF-8, UTF-16, ISO-8859-1, etc) some applications purported a different encoding in the header of the XML document than the true encoding of the document. These errors were quickly identified, as no application was able to read the documents."
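The declared-versus-actual encoding mismatch reported above (a document whose XML declaration claims one encoding while its bytes use another) can be sketched in a few lines of standard-library Python. The helper names and sample documents are ours for illustration, not the Testbed's actual tooling:

```python
# Detect an XML document whose declaration claims one encoding while the
# bytes actually use another (the failure mode described in the quote).
import re

def declared_encoding(raw):
    """Read the encoding attribute from the XML declaration, if present."""
    # The declaration itself is ASCII-compatible in UTF-8/ISO-8859-1 files;
    # UTF-16 documents would need byte-order-mark handling first.
    m = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', raw)
    return m.group(1).decode("ascii") if m else None

def encoding_mismatch(raw):
    """True if the declared (or default UTF-8) encoding cannot decode raw."""
    enc = declared_encoding(raw) or "utf-8"
    try:
        raw.decode(enc)
        return False
    except (UnicodeDecodeError, LookupError):
        return True

# Claims UTF-8 but contains a bare ISO-8859-1 byte (0xE9, 'é'):
bad = b'<?xml version="1.0" encoding="UTF-8"?><t>caf\xe9</t>'
# Same byte, correctly declared:
good = b'<?xml version="1.0" encoding="ISO-8859-1"?><t>caf\xe9</t>'
```

A check of this kind at ingest would flag such documents before any consuming application fails on them, which is how the errors were "quickly identified" above.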
SOW
DC The author is currently a private digital archives consultant, but at the time of this article, was a data architect for the Swiss Federal Archives. The content of this article owes much to the work being done by a team of architects and engineers at the Archives, who are working on an e-government project called ARELDA (Archiving of Electronic Data and Records).
Type
Web Page
Title
Creating and Documenting Text: A Guide to Good Practice
CA "The aim of this Guide is to take users through the basic steps involved in creating and documenting an electronic text or similar digital resource. ... This Guide assumes that the creators of electronic texts have a number of common concerns. For example, that they wish their efforts to remain viable and usable in the long-term, and not to be unduly constrained by the limitations of current hardware and software. Similarly, that they wish others to be able to reuse their work, for the purposes of secondary analysis, extension, or adaptation. They also want the tools, techniques, and standards that they adopt to enable them to capture those aspects of any non-electronic sources which they consider to be significant -- whilst at the same time being practical and cost-effective to implement."
Conclusions
RQ "While a single metadata scheme, adopted and implemented wholescale would be the ideal, it is probable that a proliferation of metadata schemes will emerge and be used by different communities. This makes the current work centred on integrated services and interoperability all the more important. ... The Warwick Framework (http://www.ukoln.ac.uk/metadata/resources/wf.html) for example suggests the concept of a container architecture, which can support the coexistence of several independently developed and maintained metadata packages which may serve other functions (rights management, administrative metadata, etc.). Rather than attempt to provide a metadata scheme for all web resources, the Warwick Framework uses the Dublin Core as a starting point, but allows individual communities to extend this to fit their own subject-specific requirements. This movement towards a more decentralised, modular and community-based solution, where the 'communities of expertise' themselves create the metadata they need has much to offer. In the UK, various funded organisations such as the AHDS (http://ahds.ac.uk/), and projects like ROADS (http://www.ilrt.bris.ac.uk/roads/) and DESIRE (http://www.desire.org/) are all involved in assisting the development of subject-based information gateways that provide metadata-based services tailored to the needs of particular user communities."
During the past decade, recordkeeping practices in public and private organizations have been revolutionized. New information technologies, from mainframes to PCs to local area networks and the Internet, have transformed the way state agencies create, use, disseminate, and store information. These new technologies offer a vastly enhanced means of collecting information for and about citizens, communicating within state government and between state agencies and the public, and documenting the business of government. Like other modern organizations, Ohio state agencies face challenges in managing and preserving their records because records are increasingly generated and stored in computer-based information systems. The Ohio Historical Society serves as the official State Archives with responsibility to assist state and local agencies in the preservation of records with enduring value. The Office of the State Records Administrator within the Department of Administrative Services (DAS) provides advice to state agencies on the proper management and disposition of government records. Out of concern over its ability to preserve electronic records with enduring value and assist agencies with electronic records issues, the State Archives has adapted these guidelines from guidelines created by the Kansas State Historical Society. The Kansas State Historical Society, through the Kansas State Historical Records Advisory Board, requested a program development grant from the National Historical Publications and Records Commission to develop policies and guidelines for electronic records management in the state of Kansas. With grant funds, the KSHS hired a consultant, Dr. Margaret Hedstrom, an Associate Professor in the School of Information, University of Michigan and formerly Chief of State Records Advisory Services at the New York State Archives and Records Administration, to draft guidelines that could be tested, revised, and then implemented in Kansas state government.
Notes
These guidelines are part of the ongoing effort to address the electronic records management needs of Ohio state government. As a result, this document continues to undergo changes. The first draft, written by Dr. Margaret Hedstrom, was completed in November of 1997 for the Kansas State Historical Society. That version was reorganized and updated and posted to the KSHS Web site on August 18, 1999. The Kansas Guidelines were modified for use in Ohio during September 2000.
Critical Arguments
CA "This publication is about maintaining accountability and preserving important historical records in the electronic age. It is designed to provide guidance to users and managers of computer systems in Ohio government about: the problems associated with managing electronic records, special recordkeeping and accountability concerns that arise in the context of electronic government; archival strategies for the identification, management and preservation of electronic records with enduring value; identification and appropriate disposition of electronic records with short-term value, and
Type
Web Page
Title
Online Archive of California Best Practice Guidelines for Encoded Archival Description, Version 1.1
These guidelines were prepared by the OAC Working Group's Metadata Standards Subcommittee during the spring and summer of 2003. This version of the OAC BPG EAD draws substantially on the
Language
Anonymous
Type
Web Page
Title
Descriptive Metadata Guidelines for RLG Cultural Materials
To ensure that the digital collections submitted to RLG Cultural Materials can be discovered and understood, RLG has compiled these Descriptive Metadata Guidelines for contributors. While these guidelines reflect the needs of one particular service, they also represent a case study in information sharing across community and national boundaries. RLG Cultural Materials engages a wide range of contributors with different local practices and institutional priorities. Since it is impossible to find -- and impractical to impose -- one universally applicable standard as a submission format, RLG encourages contributors to follow the suite of standards applicable to their particular community (p.1).
Critical Arguments
CA "These guidelines . . . do not set a new standard for metadata submission, but rather support a baseline that can be met by any number of strategies, enabling participating institutions to leverage their local descriptions. These guidelines also highlight the types of metadata that enhance functionality for RLG Cultural Materials. After a contributor submits a collection, RLG maps that description into the RLG Cultural Materials database using the RLG Cultural Materials data model. This ensures that metadata from the various participant communities is integrated for efficient searching and retrieval" (p.1).
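The mapping step described above, in which a contributor's local description is crosswalked into the service's common data model before ingest, can be sketched as follows. Every field name here is invented for illustration; none of them is RLG's actual schema:

```python
# Minimal crosswalk sketch: local contributor fields are renamed into a
# shared schema; unmapped fields are kept rather than silently dropped.
CROSSWALK = {
    # hypothetical local field -> hypothetical shared field
    "objekt_titel": "title",
    "kuenstler":    "creator",
    "datierung":    "date",
    "technik":      "medium",
}

def to_shared(record):
    """Map one local record into the shared schema."""
    shared, leftover = {}, {}
    for field, value in record.items():
        if field in CROSSWALK:
            shared[CROSSWALK[field]] = value
        else:
            leftover[field] = value
    if leftover:
        # Preserve unrecognized local fields for later review.
        shared["unmapped"] = leftover
    return shared

local = {"objekt_titel": "Selbstportrait",
         "kuenstler": "A. N. Other",
         "inventar_nr": "1921.44"}
```

Keeping the unmapped remainder, rather than discarding it, reflects the guidelines' baseline approach: contributors follow their own community standards, and the integration layer absorbs the differences.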
Conclusions
RQ Not applicable.
SOW
DC RLG comprises more than 150 research and cultural memory institutions, and RLG Cultural Materials elicits contributions from countless museums, archives, and libraries around the world that, although they might retain local descriptive standards and metadata schemas, must conform to the baseline standards prescribed in this document in order to integrate into RLG Cultural Materials. Appendix A represents and evaluates the most common metadata standards with which RLG Cultural Materials is able to work.
Type
Web Page
Title
Archiving of Electronic Digital Data and Records in the Swiss Federal Archives (ARELDA): e-government project ARELDA - Management Summary
The goal of the ARELDA project is to find long-term solutions for the archiving of digital records in the Swiss Federal Archives. This includes accession, long-term storage, preservation of data, description, and access for the users of the Swiss Federal Archives. It is also coordinated with the basic efforts of the Federal Archives to realize a uniform records management solution in the federal administration, and therefore to support the pre-archival creation of documents of archival value for the benefit of the administration as well as of the Federal Archives. The project is indispensable for the long-term execution of the Federal Archives Act: older IT systems are being replaced by newer ones, and a complete migration of the data is sometimes not possible or too expensive; small database applications, built and maintained by people with no IT background, are constantly increasing; and more and more administrative bodies are introducing records and document management systems.
Publisher
Swiss Federal Archives
Publication Location
Bern
Critical Arguements
CA "Archiving in general is a necessary prerequisite for the reconstruction of governmental activities as well as for the principle of legal certainty. It enables citizens to understand governmental activities and ensures a democratic control of the federal administration. And finally are archives a prerequisite for the scientific research, especially in the social and historical fields and ensure the preservation of our cultural heritage. It plays a vital role for an ongoing and efficient records management. A necessary prerequisite for the Federal Archives in the era of the information society will be the system ARELDA (Archiving of Electronic Data and Records)."
Conclusions
RQ "Because of the lack of standard solutions and limited or lacking personal resources for an internal development effort, the realisation of ARELDA will have to be outsourced and the cooperation with the IT division and the Federal Office for Information Technology, Systems and Telecommunication must be intensified. The guidelines for the projects are as follows:
SOW
DC ARELDA is one of the five key projects in the Swiss government's e-government strategy.
Museums and the Online Archive of California (MOAC) builds on existing standards and their implementation guidelines provided by the Online Archive of California (OAC) and its parent organization, the California Digital Library (CDL). Setting project standards for MOAC consisted of interpreting existing OAC/CDL documents and adapting them to the project's specific needs, while at the same time maintaining compliance with OAC/CDL guidelines. The present overview of the MOAC technical standards references both the OAC/CDL umbrella document and the MOAC implementation/adaptation document at the beginning of each section, as well as related resources that provide more detail on project specifications.
Critical Arguements
CA The project implements specifications for digital image production, as well as three interlocking file exchange formats for delivering collections, digital images, and their respective metadata. Encoded Archival Description (EAD) XML describes the hierarchy of a collection down to the item level and traditionally serves for discovering both the collection and the individual items within it. For viewing multiple images associated with a single object record, MOAC utilizes Making of America 2 (MOA2) XML. MOA2 makes the images representing an item available to the viewer through a navigable table of contents; the display mimics the behavior of the analog item by allowing end-users, for example, to browse through the pages of an artist's book. Through the further extension of MOA2 with Text Encoding Initiative (TEI) Lite XML, not only does every single page of the book display in its correct order, but a transcription of its textual content also accompanies the digital images.
Conclusions
RQ "These two instances of fairly significant changes in the project's specifications may serve as a gentle reminder that despite its solid foundation in standards, the MOAC information architecture will continue to face the challenge of an ever-changing technical environment."
SOW
DC The author is Digital Media Developer at the UC Berkeley Art Museum & Pacific Film Archives, a member of the MOAC consortium.