CA "Ironically, electronic records systems make it both possible to more fully capture provenance than paper records systems did and at the same time make it more likely that provenance will be lost and that archives, even if they are preserved, will therefore lack evidential value. This paper explores the relationship between provenance and evidence and its implications for management of paper or electronic information systems." (p. 177)
Conclusions
"Electronic information systems, therefore, present at least two challenges to archivists. The first is that the designers of these systems may have chosen to document less contextual information than may be of interest to archivists when they designed the system. The second is that the data recorded in any given information system will, someday, need to be transferred to another system. ... [A]rchivists will need to return to fundamental archival principles to determine just what they really wanted to save anyway. ... It may be that archivists will be satisfied with the degree of evidential historicity they were able to achieve in paper based record systems, in which case there are very few barriers to implementing successful electronic based archival environments. Or archivists may decide that the fuller capability of tracking the actual participation of electronic data objects in organizational activities needs to be documented by archivally satisfactory information systems, in which case they will need to define those levels of evidential historicity that must be attained, and specify the systems requirements for such environments. ... At a meeting on electronic records management research issues sponsored by the National Historical Publications and Records Commission in January 1991, participants identified the concept of technological and economic plateaux in electronic data capture and archiving as an important arena for research ... Hopefully this research will produce information to help archivists make decisions regarding the amount of contextual information they can afford to capture and the requirements of systems designed to document context along with managing data content. ... I will not be surprised as we refine our concepts of evidential historicity to discover that the concept of provenance takes on even greater granularity." (p. 192-193)
Type
Journal
Title
Reality and Chimeras in the Preservation of Electronic Records
CA An emulation approach is not viable for e-records preservation because it preserves the "wrong thing": systems functionality rather than records. Consequently, an emulation solution would not preserve e-records as evidence "even if it could be made to work."
Phrases
<P1>Electronic records that are not moved out of obsolete hardware and software environments are very likely to die with them. <P2> Failure to examine in detail what makes an electronic record evidence over time has led Rothenberg, and many others, to assume they want to preserve system functionality. (p.2) <P3> The state of a database at any given moment is not a record. (p.2) <P4> If we want to preserve electronic records, what we really want are records of the actual inputs and outputs from the system to be maintained as evidence over time. This does not require the information system to function as it once did. All (!) it requires is that we can capture all transactions entering and leaving the system when they are created, ensuring that the original context of their creation and content is documented, and that the requirements of evidence are preserved over time. (p.2)
Conclusions
RQ Metadata encapsulation strategies need to identify how metadata will be captured at the time of a record's creation, how it will be stored over time while supporting the use of the record by authorized users and, more generally, how the recordkeeping infrastructure will be constructed and maintained.
Type
Journal
Title
Capturing records' metadata: Unresolved questions and proposals for research
The author reviews a range of the research questions still unanswered by research on the capture of metadata required for recordness. These include how to maintain inviolable linkages between records and their metadata in a variety of architectures, what structure metadata content should take, the semantics of records metadata and that of other electronic sources, how new metadata can be acquired by records over time, maintaining the meaning of contextual metadata over time, the use of metadata in records management and the design of environments in which Business Acceptable Communications – BAC – (those with appropriate evidential metadata) can persist.
Critical Arguments
CA "My research consists of model building which enables the construction of theories and parallel implementations based on shared assumptions. Some of these models are now being tested in applications, so this report reflects both what we do not yet know from abstract constructs and questions being generated by field testing." ... Bearman overviews research questions such as semantics, syntax, structure and persistence of metadata that still need to be addressed.
Phrases
<P1> Records are evidence when they are bound to appropriate metadata about their content, structure and context. <P2> The metadata required for evidence is described in the Reference Model for Business Acceptable Communications (BAC). <P3> Metadata which is required for evidence must continue to be associated with the record to which it relates over time and neither it nor the record content can be alterable. <P4> To date we have only identified three implementations which, logically, could allow metadata to retain this inviolable connection. Metadata can be: kept in a common envelope WITH a record (encapsulated), bound TO a record (by integrity controls within an environment), or LINKED with a record through a technical and/or social process (registration, key deposit, etc.). <P5> Metadata content was defined in order to satisfy a range of functional requirements of records, hence it ought to have a structure which enables it to serve these functions effectively and in concrete network implementations. <warrant> <P6> Clusters of metadata must operate together. Clusters of metadata are required by different processes which take place at different times, for different software clients, and within a variety of processes. Distinct functions will need access to specified metadata substructures and must be able to act on these appropriately. Structures have been proposed in the Reference Model for Business Acceptable Communications. <P7> Metadata required for recordness must, logically, be standard; that required for administration of recordkeeping systems is extensible and locally variable. <P8> Records metadata must be semantically homogenous but it is probably desirable for it to be syntactically heterogeneous and for a range of protocols to operate against it. Records metadata management system requirements have both an internal and external aspect; internally they satisfy management requirements while externally they satisfy on-going recordness requirements.
<P9> The metadata has to come either from a specific user/session or from rules defined to extract data either from a layer in the application or a layer between the application and the recording event. <P10> A representation of the business context must exist from which the record-creating event can obtain metadata values. <P11> Structural metadata must both define the dependent structures and identify them to a records management environment which is “patrolling” for dependencies which are becoming risky in the evolving environment in order to identify needs for migration. <P12> BAC conformant environments could reduce overheads if standards supported the uniform management of records from the point of issue to the point of receipt. Could redundancy now imposed by both paper and electronic processes be dramatically reduced if records referenced other records? <P13>
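The first of the three options named in <P4>, keeping metadata in a common envelope with the record, can be illustrated with a minimal sketch. It is not taken from the BAC reference model: the names Envelope, encapsulate and verify are hypothetical, the metadata keys are invented for illustration, and the use of a SHA-256 seal is an assumption. The seal only makes alteration of content or metadata detectable; it does not by itself prevent alteration or establish who sealed the record.

```python
import hashlib
import json
from dataclasses import dataclass
from types import MappingProxyType


def _digest(content: bytes, metadata: dict) -> str:
    """Hash content and canonicalised metadata together (assumed scheme)."""
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    h = hashlib.sha256()
    h.update(len(content).to_bytes(8, "big"))  # length prefix separates the two parts
    h.update(content)
    h.update(canonical.encode("utf-8"))
    return h.hexdigest()


@dataclass(frozen=True)
class Envelope:
    """Hypothetical envelope holding a record together with its metadata."""
    content: bytes
    metadata: MappingProxyType  # read-only view of the metadata mapping
    seal: str


def encapsulate(content: bytes, metadata: dict) -> Envelope:
    """Bind metadata to content at creation time and seal both."""
    frozen = MappingProxyType(dict(metadata))
    return Envelope(content, frozen, _digest(content, dict(frozen)))


def verify(env: Envelope) -> bool:
    """Return True only if neither content nor metadata has changed."""
    return env.seal == _digest(env.content, dict(env.metadata))


# Illustrative values only; the keys echo "content, structure and context".
env = encapsulate(
    b"Approve purchase order 17",
    {"context": "procurement", "structure": "text/plain", "creator": "hypothetical-user"},
)
tampered = Envelope(
    env.content,
    MappingProxyType({**env.metadata, "context": "payroll"}),
    env.seal,
)
print(verify(env))       # True
print(verify(tampered))  # False
```

The "bound" and "linked" options in <P4> would move the seal out of the envelope, into an environment's integrity controls or an external registry respectively; the sketch does not attempt either.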
Conclusions
RQ "All the proposed methods have some degree of external dependency. What are the implications [of] software dependencies? Encapsulation, integrity controls and technico-social process are all software dependent. Is this avoidable? Can abstract reference models of the metadata captured by these methods serve to make them effectively software independent?" ... "What are the relative overhead costs of maintaining the systems which give adequate societal assurances of records retention following any of these approaches? Are there some strategies that are currently more efficient or effective? What are the organizational requirements for implementing metadata capture systems? In particular, what would the costs of building such systems within a single institution be versus the costs of implementing records metadata adhering communications servers on a universal scale?" ... "Can we model mechanisms to enable an integrated environment of recordkeeping throughout society for all electronically communicated transactions?" ... "Are the BAC structures workable? Complete? Extensible in ways that are known to be required? For example, metadata required for “recordness” is created at the time of the creation of the records but other metadata, as premised by the Warwick Framework, may be created subsequently. Are these packets of metadata orthogonal with respect to recordness? If not, how are conflicts dealt with?" ... "Not all metadata references fixed facts. Thus, for example, we have premised that proper reference to a retention schedule is a citation to an external source rather than a date given within the metadata values of a record. Similar external references are required for administration of shifting access permissions. What role can registries (especially rights clearinghouses) play in a world of electronic records?
How well do existing languages for permission management map to the requirements of records administration, privacy and confidentiality protection, security management, records retention and destruction, etc." ... "Not all records will be created with equally perfect metadata. Indeed risk-based decisions taken by organizations in structuring their records’ capture are likely to result in conscious decisions to exclude certain evidential metadata. What are the implications of incomplete metadata on an individual organization level and on a societal level? Does the absence of data as a result of policy need to be noted? And if so, how?" ... "Since metadata has owners, how do owners administer records’ metadata over time? In particular, since records contain records, how are the layers of metadata exposed for management and administrative needs (if internal metadata documenting dependencies can slip through the migration process, we will end up with records that cannot serve as evidence. If protected records within unprotected records are not protected, we will end up with insecure records environments, etc. etc.)." ... "In principle, the BAC could be expressed as Dublin metadata and insofar as it cannot be, the Dublin metadata will be inadequate for evidence. What other syntax could be used? How could these be comparatively tested?" ... "Could Dublin Core metadata, if extended by qualifying schema, serve the requirements of recordness? Records are, after all, documents in the Dublin sense of fixed information objects. What would the knowledge representation look like?" ... "Strategies for metadata capture currently locate the source of metadata either in the API layer, or the communications system, using data provided by the application (an analysis supports defining which data and where they can be obtained), from the user interface layer, or from the business rules defined for specified types of communication pathways.
Can all the required metadata be obtained by some combination of these sources? In other words, can all the metadata be acquired from sources other than content created by the record-creator for the explicit and sole purpose of documentation (since such data is both suspect in itself and the demand for it is annoying to the end user)?" ... "Does the capture of metadata from the surrounding software layers require the implementation of a business-application specific engine, or can we design generic tools that provide the means by which even legacy computing systems can create evidential records if the communication process captures the interchange arising from a record-event and binds it with appropriate metadata?" ... "What kinds of representations of business processes and structures can best carry contextualizing metadata at this level of granularity and simultaneously serve end user requirements? Are the discovery and documentation representations of provenance going to have to be different?" ... "Can a generic level of representation of context be shared? Do standards such as STEP provide adequate semantic rules to enable some meaningful exchange of business context information?" ... "Using past experiences of expired standards as an indicator, can the defined structural metadata support necessary migrations? Are the formal standards of the source and target environments adequate for actual record migration to occur?" ... "What metadata is required to document a migration itself?" ... "Reduction of redundancy requires record uses to impose post-creation metadata locks on records created with different retention and access controls. To what extent is the Warwick Framework relevant to these packets and can architectures be created to manage these without their costs exceeding the savings?" ... "A number of issues about proper implementation depend on the evolution (currently very rapid) of metadata strategies in the broader Internet community.
Issues such as unique identification of records, external references for metadata values, models for metadata syntax, etc. cannot be resolved for records without reference to the ways in which the wider community is addressing them. Studies that are supported for metadata capture methods need to be aware of, and flexible in reference to, such developments."
The Getty Art History Information Program: Research Agenda for Cultural Heritage on Information Networks
Publication Year
1995
Critical Arguments
CA The inability to effectively preserve and authenticate electronic records presents a significant problem for humanities research, which depends on correct attribution and the ability to view resources long after they were created.
Phrases
<P1> Current research on software dependence and interoperability is not largely driven by archival concerns and takes a relatively short view on the requirement to preserve functionality. Little research has been done on modeling the information loss that accompanies multiple migrations or the risks inherent in the use of commercial systems before standards are developed, yet these are the critical questions being posed by archives. (p.2) <P2> The metadata required for recordness and the means to capture this data and ensure that it is bonded to electronic communications is the most significant area for research in the near future. (p.3) <P3> Within organizations, archivists must find automatic means of identifying the business process for which a record is generated. Such data modeling will become increasingly critical in an era of ongoing business re-engineering. If records are retained for their evidential significance and for a period associated with risk, then certain knowledge of their functional source is essential to their rational control. If they are retained for long-term informational value, knowledge of context is necessary to understand their significance. (p.3) <warrant>
Conclusions
RQ We need to research what value e-records have other than as a means of assessing accountability. How are they used, and what value do users derive from them? What do we need to know about a record's content to support the discovery of billions of records? How can our preservation solutions be made scalable?
CA Makes a distinction between archival description of the record at hand and documentation of the context of its creation. Argues the importance of the latter in establishing the evidentiary value of records, and criticizes ISAD(G) for its failure to account for context. "(1) The subject of documentation is, first and foremost, the activity that generated the records, the organizations and individuals who used the records, and the purposes to which the records were put. (2). The content of the documentation must support requirements for the archival management of records, and the representations of data should support life cycle management of records. (3) The requirements of users of archives, especially their personal methods of inquiry, should determine the data values in documentation systems and guide archivists in presenting abstract models of their systems to users." (p. 45-46)
Phrases
<P1> [T]he ICA Principles rationalize existing practice -- which the author believes as a practical matter we cannot afford; which fail to provide direct access for most archives users; and which do not support the day-to-day information requirements of archivists themselves. These alternatives are also advanced because of three, more theoretical, differences with the ICA Principles: (1) In focusing on description rather than documentation, they overlook the most salient characteristic of archival records: their status as evidence. (2) In proposing specific content, they are informed by the bibliographic tradition rather than by concrete analysis of the way in which information is used in archives. (3) In promoting data value standardization without identifying criteria or principles by which to identify appropriate language or structural links between the objects represented by such terms, they fail adequately to recognize that the data representation rules they propose reflect only one particular, and a limiting, implementation. (p. 33-34) <P2> Archives are themselves documentation; hence I speak here of "documenting documentation" as a process the objective of which is to construct a value-added representation of archives, by means of strategic information capture and recording into carefully structured data and information access systems, as a mechanism to satisfy the information needs of users including archivists. Documentation principles lead to methods and practices which involve archivists at the point, and often at the time, of records creation. In contrast, archival description, as described in the ICA Principles[,] is "concerned with the formal process of description after the archival material has been arranged and the units or entities to be described have been determined." 
(1.7) I believe documentation principles will be more effective, more efficient and provide archivists with a higher stature in their organizations than the post accessioning description principles proposed by the ICA. <warrant> (p. 34) <P3> In the United States, in any case, there is still no truly theoretical formulation of archival description principles that enjoys a widespread adherence, in spite of the acceptance of rules for description in certain concrete application contexts. (p. 37) <P4> [T]he MARC-AMC format and library bibliographic practices did not adequately reflect the importance of information concerning the people, corporate bodies and functions that generated records, and the MARC Authority format did not support appropriate recording of such contexts and relations. <warrant> (p. 37) <P5> The United States National Archives, even though it had contributed to the data dictionary which led to the MARC content designation, all the data which it believed in 1983 that it would want to interchange, rejected the use of MARC two years later because it did not contain elements of information required by NARA for interchange within its own information systems. <warrant> (p. 37) <P6> [A]rchivists failed to understand then, just as the ISAD(G) standard fails to do now, that rules for content and data representation make sense in the context of the purposes of actual exchanges or implementation, not in the abstract, and that different rules or standards for end-products may derive from the same principles. (p. 
38) <P7> After the Committee on Archival Information Exchange of the Society of American Archivists was confronted with proposals to adopt many different vocabularies for a variety of different data elements, a group of archivists who were deeply involved in standards and description efforts within the SAA formed an Ad Hoc Working Group on Standards for Archival Description (WGSAD) to identify what types of standards were needed in order to promote better description practices. WGSAD concluded that existing standards were especially inadequate to guide practice in documenting contexts of creation. Since then, considerable progress has been made in developing frameworks for documentation, archival information systems architecture and user requirements analysis, which have been identified as the three legs on which the documenting documentation platform rests. <warrant> (p. 38) <P8> Documentation of organizational activity ought to begin long before records are transferred to archives, and may take place even before any records are created -- at the time when new functions are assigned to an organization. (p. 39) <P9> It is possible to identify records which will be created and their retention requirements before they are created, because their evidential value and informational content are essentially predetermined. (p. 39) <P10> Archivists can actively intervene through regulation and guidance to ensure that the data content and values depicting activities and functions are represented in such a way that will make them useful for subsequent management and retrieval of the records resulting from these activities. This information, together with systems documentation, defines the immediate information system context out of which the records were generated, in which they are stored, and from which they were retrieved during their active life. (p.
39) <P11> Documentation of the link between data content and the context of creation and use of the records is essential if records (archives or manuscripts) are to have value as evidence. (p. 39) <P12> [C]ontextual documentation capabilities can be dramatically improved by having records managers actively intervene in systems design and implementation.  The benefits of proactive documentation of the context of records creation, however, are not limited to electronic records; the National Archives of Canada has recently revised its methods of scheduling to ensure that such information about important records systems and contexts of records creation will be documented earlier. <warrant> (p. 39) <P13> Documentation of functions and of information systems can be conducted using information created by the organization in the course of its own activity, and can be used to ensure the transfer of records to archives and/or their destruction at appropriate times. It ensures that data about records which were destroyed as well as those which were preserved will be kept, and it takes advantage of the greater knowledge of records and the purposes and methods of day-to-day activity that exist closer to the events. (p. 40) <P14> The facts of processing, exhibiting, citing, publishing and otherwise managing records becomes significant for their meaning as records, which is not true of library materials. (p. 41) <P15> [C]ontent and data representation requirements ought to be derived from analysis of the uses to which such systems must be put, and should satisfy the day to day information requirements of archivists who are the primary users of archives, and of researchers using archives for primary evidential purposes. (p. 41) <P16> The ICA Commission proposes a principle by which archivists would select data content for archival descriptions, which is that "the structure and content of representations of archival material should facilitate information retrieval." 
(5.1) Unfortunately, it does not help us to understand how the Commission selected the twenty-five elements of information identified as its standard, or how we could apply the principle to the selection of additional data content. It does, however, serve as a prelude to the question of which principles should guide archivists in choosing data values in their representations. (p. 42) <P17> Libraries have found that subject access based on titles, tables of contents, abstracts, indexes and similar formal subject analysis by-products of publishing can support most bibliographic research, but the perspectives brought to materials by archival researchers are both more varied and likely to differ from those of the records creators. (p. 43) <P18> The user should not only be able to employ a terminology and a perspective which are natural, but also should be able to enter the system with a knowledge of the world being documented, without knowing about the world of documentation. (p. 44) <P19> Users need to be able to enter the system through the historical context of activity, construct relations in that context, and then seek avenues down into the documentation. This frees them from trying to imagine what records might have survived -- documentation assists the user to establish the non-existence of records as well as their existence -- or to fathom how archivists might have described records which did survive. (p. 44) <P20> When they departed from the practices of Brooks and Schellenberg in order to develop means for the construction of union catalogues of archival holdings, American archivists were not defining new principles, but inventing a simple experiment. After several years of experience with the new system, serious criticisms of it were being leveled by the very people who had first devised it. (p. 45)
Conclusions
RQ "In short, documentation of the three aspects of records creation contexts (activities, organizations and their functions, and information systems), together with representation of their relations, is essential to the concept of archives as evidence and is therefore a fundamental theoretical principle for documenting documentation. Documentation is a process that captures information about an activity which is relevant to locating evidence of that activity, and captures information about records that are useful to their ongoing management by the archival repository. The primary source of information is the functions and information systems giving rise to the records, and the principal activity of the archivist is the manipulation of data for reference files that create richly-linked structures among attributes of the records-generating context, and which point to the underlying evidence or record." (p. 46)
Type
Electronic Journal
Title
Electronic Records Research: Working Meeting May 28-30, 1997
CA Archivists are specifically concerned with records that are not easy to document -- records that are full of secret, proprietary or sensitive information, not to mention hardware and software dependencies. This front end of recordmaking and keeping must be addressed as we define what electronic records are and are not, and how we are to deal with them.
Phrases
<P1> Driven by pragmatism, the University of Pittsburgh team looked for "warrant" in the sources considered authoritative by the practitioners of ancillary professions on whom archivists rely -- lawyers, auditors, IT personnel, etc. (p.3) <P2> If the record creating event and the requirements of 'recordness' are both known, focus shifts to capturing the metadata and binding it to the record contents. (p.7) <P3> A strong business case is still needed to justify the role of archivists in the creation of electronic record management systems. (p.10)
Conclusions
RQ Warrant needs to be looked at in different countries. Does the same core definition of what constitutes a record cut across state borders? What role do specific user needs play in complying with regulation and risk management?