Electronic Document Management and the Digital Library for E-commerce
Tom Worthington FACS
Visiting Fellow, Department of Computer Science, Australian National University, Canberra
For: Computing 3410 Students 2000, The Australian National University
This document is Version 2.0 1 August 2000: http://www.tomw.net.au/2000/edm.html
This material was prepared for the unit Information Technology in Electronic Commerce (COMP3410) at the Australian National University, semester 2, 2000. Accompanying documents discuss The Eighteen Character Problem, Metadata and XML.
Electronic Document Management allows legally recognised documents used in e-commerce transactions to be created, transmitted and stored. Without electronic document management, fast and efficient e-commerce transactions would be buried under mounds of paper documenting the transactions, or be tied up in litigation over the authenticity of the electronic originals. The Digital Library has the potential to allow access to electronic documents, while respecting the intellectual property rights of the author.
In 1995 a government committee, chaired by the author, made recommendations for electronic document management in Australian Government Agencies (OGO 1995):
Fully effective management of electronic documents requires consideration of an agency's total information environment. No single medium now holds all the documents relating to an agency's business activities. All sources should be managed in a co-ordinated way, in a manner appropriate to their environment, in order to preserve and provide access to business documents.
Electronic document management systems are more than just systems for tracking the location of electronic documents. Such systems should manage documents for their complete life cycle based on the value of the document to the agency's business. Just as there are standard procedures for the registration of paper documents and records, suitable procedures should be implemented to manage each electronic document throughout its life from creation to disposal...
Whatever strategy is adopted, the document management system must:
provide adequate context information for documents;
provide means to prove the authenticity of documents used as evidence;
provide for the disposal of records in conformance with the Archives Act 1983;
be robust against organisational or technological change;
provide levels of support for different types of document that accord with agency policy; and
provide links between paper and electronic documents.
The role of documents as evidence is emphasised:
All agencies must manage evidence. Evidence is the proof of how we acted. It is how we deal with our clients, customers, other agencies or bodies in the private sector, and how they deal with us. It is the basis from which we report to government and the voters. It is what we use to show we run our agencies efficiently and effectively. Above all it is what we use to discharge some obligation because we are held accountable for our actions...
It is not a new problem. The need for evidence has been around for a long time. What is changing is the way we keep the evidence. More and more the proof is moving from traditional paper documents to electronic media. This poses problems because traditional records management disciplines that have been applied to paper documents are not necessarily being applied to electronic documents. This can result in:
confusion between different versions of a document (e.g. because there may be multiple copies, none of which is the authoritative version);
loss or destruction of documents that should be kept (e.g. because there is no central repository analogous to the paper file repository, and the author is unaware of the need for retention);
questionable authenticity, because of possible manipulation of text in electronic documents;
loss of context of documents (e.g. because related documents are not linked or kept together); and
documents becoming inaccessible because of technological change (e.g. changes in software or storage media make the files unreadable).
Design Issues
For the information technology specialist, the problem is to translate these requirements into working systems. The common current approach is to use:
Metadata in a text-readable format (mostly supersets of Dublin Core) to describe the records. The metadata can be held with the record or separately.
Standard document formats to store and transport the documents. Implementations use the original format the document was created in, a standardised format (such as XML or PDF), or multiple formats.
Security to identify records and protect their integrity, using digital signatures.
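The first and third items above can be sketched in a few lines of Python. This is a minimal illustration, not a method prescribed by any of the standards discussed here: the function names, the `record` and `digest` element names and the choice of three Dublin Core elements are all assumptions made for the example. A SHA-256 digest stands in for a digital signature; it lets a reader detect alteration but, unlike a signature, does not identify who created the record.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"


def describe_record(content: bytes, title: str, creator: str, date: str) -> str:
    """Return an XML description of a record, held separately from the record."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in (("title", title), ("creator", creator), ("date", date)):
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    # A digest is not a signature: it detects change but proves no authorship.
    digest = hashlib.sha256(content).hexdigest()
    ET.SubElement(root, "digest", algorithm="sha256").text = digest
    return ET.tostring(root, encoding="unicode")


def is_unaltered(content: bytes, description_xml: str) -> bool:
    """Check the record content against the digest stored in its description."""
    root = ET.fromstring(description_xml)
    return root.findtext("digest") == hashlib.sha256(content).hexdigest()


if __name__ == "__main__":
    document = b"Purchase order 17"
    description = describe_record(document, "Purchase order 17", "A. Clerk", "2000-08-01")
    print(description)
    print(is_unaltered(document, description))
```

Keeping the description in a separate text file means it stays readable even if the software for the original document format disappears.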
Australian Standard AS 4390 is a national standard on records management. It was developed by the IT/21 Committee of Standards Australia, with representatives from professional associations, National Archives of Australia, government agencies and universities. The Standard was released in February 1996 and has six parts (Roberts 1998):
AS 4390.1-1996 Part 1: General
AS 4390.2-1996 Part 2: Responsibilities
AS 4390.3-1996 Part 3: Strategies
AS 4390.4-1996 Part 4: Control
AS 4390.5-1996 Part 5: Appraisal and Disposal
AS 4390.6-1996 Part 6: Storage.
Like other Australian standards, this is a voluntary code of practice. However, such standards are commonly adopted by government agencies. Companies are not required to comply with the standard, but need to satisfy regulators and courts that their records are well kept. The standard is intended to apply to the management of electronic records as well as paper records. This is particularly important with electronic commerce, where there may be no paper records to present to a regulator or court as evidence of a business transaction. A court will need to be convinced that electronic records are well kept by an organisation for those records to be used in evidence.
A record is defined in the standard as:
'recorded information, in any form, including data in computer systems, created or received and maintained by an organisation or person in the transaction of business or the conduct of affairs and kept as evidence of such activity.' (AS 4390.1-1996: General, Clause 4.21).
Unfortunately, like most standards published by Standards Australia, AS 4390 is not freely available in electronic format. Therefore the standard tends to be mentioned in reports, rather than actually read or implemented.
Records Systems Requirements
Reliability - Records systems, procedures and practices should work reliably to ensure that records are credible. Any system managing records has to be capable of continuous and regular operation.
Integrity - Control measures such as access monitoring, user verification and authorized destruction should be implemented to prevent unauthorized destruction, alteration or removal of records.
Compliance - Records systems should be managed in compliance with all requirements arising from the current business, regulatory and accountability environment and community expectations in which the organization operates.
Comprehensiveness - Records systems should manage records resulting from the complete range of business activities for the organization in which they operate.
Systematic - Records should be created, maintained and managed systematically:
Creation and maintenance through the design and operation of records and business systems; and
Management through accurately documented policies, responsibilities and methodologies.
Recordkeeping Metadata Standard for Commonwealth Agencies
In contrast to the Australian Records standard, the Recordkeeping Metadata Standard for Commonwealth Agencies (NAA 1999) is freely available in electronic format. It has similarities to the Australian Government Locator Service (AGLS) metadata standard from the same agency. AGLS's 19 descriptive elements are designed to improve the visibility and accessibility of services and information over the Internet by the general public. The Recordkeeping Metadata Standard is more complex with 20 elements (eight mandatory) and 65 sub-elements. It is designed for the recordkeeping systems used by Commonwealth government agencies, as used by records managers and staff:
Compliance with the Recordkeeping Metadata Standard for Commonwealth Agencies will help agencies to identify, authenticate, describe and manage their electronic records in a systematic and consistent way to meet business, accountability and archival requirements. The standard is designed to be used as a reference tool by agency corporate managers, IT personnel and software vendors involved in the design, selection and implementation of electronic recordkeeping and related information management systems. It defines a basic set of 20 metadata elements (eight of which constitute a core set of mandatory metadata) and 65 sub-elements that may be incorporated within such systems, and explains how they should be applied within the Commonwealth sphere.
Part One of the standard explains the purpose and importance of standardised recordkeeping metadata and details the scope, intended application and features of the standard. Features include: flexibility of application; repeatability of data elements; extensibility to allow for the management of agency-specific recordkeeping requirements; interoperability across systems environments; compatibility with related metadata standards, including the Australian Government Locator Service (AGLS) standard; and interdependency of metadata at the sub-element level.
Part Two of the standard provides full details on the 20 elements and 65 sub-elements, defining them in relation to their purpose and rationale. For each element and sub-element the standard provides an indication of applicability, obligation, conditions of use, assigned values and approved schemes. Where useful, elements and sub-elements are illustrated with examples.
Appended to the standard are tables of element and sub-element inter-relationships and interdependencies, and a Change Request Form for use by agencies and vendors wishing to request changes or additions to the standard.
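The distinction between mandatory and optional elements, and the repeatability of elements, lends itself to a simple automated check. The sketch below is an assumption-laden illustration: the three element names in `EXAMPLE_MANDATORY` are placeholders chosen for the example and are not the standard's actual list of eight mandatory elements, which should be taken from the standard itself.

```python
from typing import Dict, List, Set

# Placeholder names only: NOT the standard's actual mandatory elements.
EXAMPLE_MANDATORY: Set[str] = {"title", "date", "record_identifier"}


def missing_mandatory(
    metadata: Dict[str, List[str]],
    mandatory: Set[str] = EXAMPLE_MANDATORY,
) -> List[str]:
    """Return the mandatory elements that are absent or have only empty values.

    Each element maps to a list of values because elements may be repeated.
    """
    return sorted(
        name
        for name in mandatory
        if not any(value.strip() for value in metadata.get(name, []))
    )


if __name__ == "__main__":
    record = {"title": ["Minutes of meeting"], "date": [""], "subject": ["EDM", "Metadata"]}
    print(missing_mandatory(record))
```

A recordkeeping system could run such a check when a record is registered, rejecting or flagging records whose mandatory metadata is incomplete.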
Australian Recordkeeping Metadata Schema (RKMS)
The Australian Recordkeeping Metadata Schema (RKMS) is a product of research at Monash University, to provide:
a standardised set of structured recordkeeping metadata elements;
a framework for developing and specifying recordkeeping metadata standards;
a framework for reading or mapping metadata sets in ways which can enable their semantic interoperability by establishing equivalences and correspondences that can provide the basis for semi-automated translation between metadata schemas.
RKMS uses a "Simple Text Syntax", based on the proposed Dublin Core Structured Values scheme, although future work may use HTML or RDF syntax. However, the work does not appear to have progressed to the point where it could be used for automated tools for mark-up of metadata or for translation between schemas.
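Semi-automated translation between metadata schemas can be pictured as a table of equivalences plus a list of leftovers for a person to resolve. The following sketch assumes a hypothetical crosswalk between two invented schemas; the element names are illustrative and are not taken from RKMS or any other published schema.

```python
from typing import Dict, Tuple

# Hypothetical equivalences between two metadata schemas, for illustration only.
CROSSWALK: Dict[str, str] = {
    "creator": "agent",
    "identifier": "record_identifier",
    "title": "title",
}


def translate(
    metadata: Dict[str, str],
    crosswalk: Dict[str, str] = CROSSWALK,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Translate element names using the crosswalk.

    Returns the translated elements and the elements with no known equivalent,
    which are left for manual review (the "semi" in semi-automated).
    """
    translated: Dict[str, str] = {}
    unmapped: Dict[str, str] = {}
    for name, value in metadata.items():
        if name in crosswalk:
            translated[crosswalk[name]] = value
        else:
            unmapped[name] = value
    return translated, unmapped


if __name__ == "__main__":
    source = {"creator": "T. Worthington", "title": "EDM", "coverage": "Australia"}
    print(translate(source))
```

A real mapping would also need to handle elements that correspond only partially, or that split into several sub-elements, which a flat table like this cannot express.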
Standard for the Management of Electronic Records in the Victorian Government
The Public Record Office of Victoria has issued PROS 99/007 Standard for the Management of Electronic Records (VERS). This is more prescriptive than other Australian efforts, covering:
VERS uses a superset of the National Archives of Australia (NAA) Recordkeeping metadata. VERS allows multiple encoding of one document and fixes the record at the time of creation using digital signatures. This requires new metadata to be kept separate from the document, or wrapped around the original record to form a new compound record. It also assumes that a particular digital signature will be readable over a long time and that the digital signature standards used will be supported in the long term.
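The idea of fixing a record at creation and wrapping later metadata around it can be sketched as follows. This is not the VERS format: the JSON layout, field names and function names are assumptions for illustration, and an HMAC with a shared key stands in for a public-key digital signature, since the Python standard library provides no signature scheme.

```python
import hashlib
import hmac
import json


def seal(payload: dict, key: bytes) -> dict:
    """Fix a payload by attaching a keyed digest (a stand-in for a signature)."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return {"payload": payload, "seal": tag}


def verify(wrapped: dict, key: bytes) -> bool:
    """Recompute the keyed digest and compare it with the stored one."""
    body = json.dumps(wrapped["payload"], sort_keys=True).encode("utf-8")
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, wrapped["seal"])


def wrap_with_metadata(original: dict, new_metadata: dict, key: bytes) -> dict:
    """Form a new compound record around a sealed record, leaving it untouched."""
    return seal({"metadata": new_metadata, "record": original}, key)


if __name__ == "__main__":
    key = b"demo-key"  # hypothetical key, for the example only
    first = seal({"content": "Invoice 42", "metadata": {"title": "Invoice"}}, key)
    compound = wrap_with_metadata(first, {"disposal": "retain 7 years"}, key)
    # Both the outer wrapper and the original inner record still verify.
    print(verify(compound, key), verify(compound["payload"]["record"], key))
```

Because the original sealed record is nested unchanged inside the wrapper, its seal remains checkable for as long as the algorithm and key are available, which is exactly the long-term assumption noted above.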
VERS uses the Portable Document Format (PDF) Version 1.3 for its document standard. It is unfortunate that a standard had to be chosen before implementations of XML, which offers a better long-term prospect, were available.
Proposed NSW Recordkeeping Metadata Standard
NSW State Records was expecting to release a recordkeeping metadata standard for the NSW public sector in July 2000. The draft provides a structure broadly similar to other Australian government standards, but is less developed.
In some ways the lack of Australian standards for electronic document management, and in particular metadata, is disappointing. However, the approaches taken for metadata are based on the same Dublin Core foundation and differ only in detail. The standards for document format await the development of XML. This provides the opportunity for the computer science community to provide innovative ways to handle the complexity of the multiple formats in a technically unified and efficient manner.
Management, Appraisal and Preservation of Electronic Records
In its Management, Appraisal and Preservation of Electronic Records, the UK Public Record Office details a similar approach for UK Government records, including the use of a Dublin Core derived set of metadata. However, no specific metadata scheme is specified, and the organisation demonstrates a lack of commitment by not tagging its own metadata document with metadata.
Guidelines for Commonwealth information published in electronic formats
Electronic document management provides a way to manage what are usually internal records in an organisation. A related task is publishing electronic information, and designing for electronic documents. Before the web, the distinction between internal organisation documents and external publishing was clear. With the advent of the web, these distinctions are disappearing and there is a tendency to use the same technology for creating and indexing internal documents and external documents. However, the legal distinctions remain and business practice has not caught up with technological developments. Therefore "publishing" for the electronic library remains a separate and distinct activity.
A good overview of publishing issues is provided in the Guidelines for Commonwealth information published in electronic formats (AusInfo 1999):
The introduction of digital technology allows information to be stored in open formats from which a range of end products can be generated. Modern communications and computer technology allows transmission of one digital file from which several different user-formats can be generated. For example, a file meeting the latest specifications for Internet text can be viewed on a screen as text, displayed as Braille or run through a speech synthesiser and read aloud. These developments have profound importance in enabling the creation of documents that are accessible to a wide range of people. With a small amount of care at the outset, one document prepared in a standard format can meet a variety of needs, with the end-user taking responsibility for how the document is accessed.
The guidelines recommend the use of the AGLS metadata (as discussed in the accompanying document on Metadata). They also recommend use of the Human Rights and Equal Opportunity Commission advisory notes on World Wide Web access, issued under the Disability Discrimination Act 1992 for the purpose of avoiding discrimination:
Availability of information and services in electronic form via the web has the potential to provide equal access for people with a disability; and to provide access more broadly, more cheaply and more quickly than is otherwise possible using other formats. Examples of access are:
People who are blind or have vision impairments can use appropriate equipment and software to gain access to electronic documents in Braille, audio or large print form.
Deaf people or people with hearing impairments could have more ready access to captioning or transcription of sound material.
Many people whose disability makes it difficult to handle or read paper pages can use a computer, for example with a modified keyboard or with voice control.
Web publication may provide an effective means of access for people whose disability makes it difficult for them to travel to or enter premises where the paper form of a document is available.
Australian Digital Theses Program
The Australian Digital Theses Program aims to establish a distributed database of digital versions of theses produced by postgraduate research students at Australian universities. It uses PDF for document storage, which has severe limitations as an electronic document format. Dublin Core metadata is automatically generated from the ADT Deposit form. It is intended to use an e-commerce model to charge for printing and downloading of documents. The UNSW Online Payment System is used as an example of how this could be done.
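Generating Dublin Core metadata from a deposit form is largely a matter of mapping form fields to elements. The sketch below is not the ADT Program's implementation: the form field names and the mapping are hypothetical, and the output uses the common convention of embedding Dublin Core in HTML `meta` tags with `DC.`-prefixed names.

```python
from html import escape

# Hypothetical deposit-form fields mapped to Dublin Core elements.
FIELD_TO_DC = {
    "thesis_title": "DC.Title",
    "author": "DC.Creator",
    "year": "DC.Date",
    "abstract": "DC.Description",
}


def dc_meta_tags(form: dict) -> str:
    """Build HTML meta tags from the form, skipping fields left blank."""
    lines = []
    for field, element in FIELD_TO_DC.items():
        value = form.get(field, "").strip()
        if value:
            # Escape the value so quotes and ampersands cannot break the tag.
            content = escape(value, quote=True)
            lines.append(f'<meta name="{element}" content="{content}">')
    return "\n".join(lines)


if __name__ == "__main__":
    deposit = {"thesis_title": "Records & Evidence", "author": "J. Student", "year": "2000"}
    print(dc_meta_tags(deposit))
```

Because the author fills in the form only once, the metadata is produced as a by-product of deposit rather than as a separate cataloguing step.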
Open eBook Publication Structure
The Open eBook Forum (OEBF) has published the Open eBook Publication Structure. This XML-based format attempts to be expressive enough for paper publishing, while maintaining compatibility with web browsers. It should also be usable with hand-held e-books, as well as being more efficient and more suitable for on-screen reading than PDF.
Roberts (1998) The New Australian Records Management Standard, David Roberts, State Records Office, New South Wales URL: http://www.records.nsw.gov.au/publicsector/rk/sacramento/sacra_2.htm
McGee (1999) The International Records Management Standard - Implications for the Future - Speakers Notes, Don McGee, RMI Fall Seminar, 1999, URL: http://www.rmicanada.com/seminar/seminarnote_mcgee.htm#4.6 Records Systems Characteristics
OGO (1995) Improving Electronic Document Management: Guidelines for Australian Government Agencies, Office of Government Information Technology, 1995, URL: http://www.defence.gov.au/imsc/edmsc/iedmtc.htm
AusInfo (1999) Guidelines for Commonwealth information published in electronic formats, AusInfo, Commonwealth of Australia 1999, Revised Edition, January 2000, URL: http://www.ausinfo.gov.au/guidelines/
NAA (1999) Recordkeeping Metadata Standard for Commonwealth Agencies (version 1.0), National Archives of Australia, Commonwealth of Australia 1999, URL: http://www.naa.gov.au/recordkeeping/control/rkms/summary.htm