Digital Library and E-Publishing

Tom Worthington FACS HLM

For the course Electronic Document Management

Publishing Mistakes Are Dangerous

Sirs: Recently we found out that our abstract "Severe Tardive Dystonia: Treatment with Continuous Intrathecal Baclofen Administration" (J Neurol 243 Suppl 2: S75) contains a severe and potentially dangerous mistake.

The dose of intrathecal baclofen in the patient presented was 100 µg/day rather than 100 g/day. The abstract submitted as well as the computer disk (Microsoft Word for Windows Version 2.0b) additionally handed in for electronic publication contained the correct figure spelled with the Greek character "µ".

Investigations into this subject revealed that occasionally special characters may be misinterpreted by different versions of the same wordprocessing programme ...

From: "Risks of electronic publishing", D. Dressler, page 61, Letters to the Editors, Journal of Neurology, Steinkopff Verlag, Volume 244, Number 1/November 28, 1996, URL: http://www.springerlink.com/openurl.asp?genre=article&eissn=1432-1459&volume=244&issue=1&spage=61

Publishing, even academic publishing, is a significant economic activity and can also have significant effects on the lives of the public. The letter above, found among articles on "electronic publishing", is an example.
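The kind of failure described in the letter can be reproduced in miniature. The following is a minimal sketch in Python, not a reconstruction of what happened inside the word processor: it assumes the text was written in one 8-bit character set (Windows-1252) and read back in another (code page 437), both chosen purely for illustration. The function name reinterpret is hypothetical.

```python
def reinterpret(text, written_as, read_as):
    """Encode text with one character set and decode the bytes with another."""
    return text.encode(written_as).decode(read_as, errors="replace")


dose = "100 \u00b5g/day"  # \u00b5 is the micro sign

# Same character set on both sides: the dose survives unchanged.
same = reinterpret(dose, "cp1252", "cp1252")

# Different character sets: the byte that stood for the micro sign
# is displayed as some other character, while the ASCII text is intact.
mixed = reinterpret(dose, "cp1252", "cp437")

print(same == dose)
print(mixed == dose)
print(repr(mixed))
```

Only the one special character is damaged; the digits and the rest of the unit pass through untouched, which is what makes this class of error easy to miss in proofreading.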

The Digital Library allows access to electronic documents, while respecting the intellectual property rights of the author. Before the web, the distinction between internal organisation documents and external publishing was clear. With the advent of the web, these distinctions are disappearing and there is a tendency to use the same technology for creating and indexing internal documents and for external document publishing. However, the legal distinctions remain and business practice has not caught up with technological developments. Therefore "publishing" for the electronic library remains a separate and distinct activity.

Accessibility

Availability of information and services in electronic form via the web has the potential to provide equal access for people with a disability; and to provide access more broadly, more cheaply and more quickly than is otherwise possible using other formats. Examples of access are:

  • People who are blind or have vision impairments ...captioning or transcription...

  • Deaf people or people with hearing impairments ...

  • ... disability makes it difficult to handle or read paper ...

  • ... travel to or enter premises where the paper form of a document is available.

From: "World Wide Web Access: Disability Discrimination Act Advisory Notes", Version 3.2, August 2002, Human Rights and Equal Opportunity Commission, URL: http://www.hreoc.gov.au/disability_rights/standards/www_3/www_3.html

See also: 2008 Beijing Olympics Website.

The AGIMO publishing guidelines require level "A" conformance with the Web Content Accessibility Guidelines 1.0 (World Wide Web Consortium), as detailed in the Human Rights and Equal Opportunity Commission advisory notes on World Wide Web access, issued under the Disability Discrimination Act 1992 for the purpose of avoiding discrimination.

Library Metadata

The new Bibliotheca Alexandrina will be officially opened by Egyptian President Hosni Mubarak at a ceremony attended by other heads of state and top officials.

Based on the old Library of Alexandria, the most famous library of Ancient Times, this modern public study centre will be open to students, researchers and the general public. ...

From: "Inauguration of the Alexandria Library", UNESCO, 2002

Libraries now provide web-based search facilities which look similar to web search engines. They look alike partly because web search engines evolved from library concepts and partly because on-line library users are now used to web search interfaces.

It should be appreciated that libraries have been in the information business for far longer than IT professionals. As an example, the Library of Alexandria was destroyed by fire about 2000 years ago, but opened again in 2002, with a web site.

On-line Public Access Catalog (OPAC)

Author Aristotle, 384-322 B.C.
Title Athenaion Politeia / Aristoteles; Edidit Mortimer Chambers.
Publisher Stuttgart : B.G. Teubner, 1994.
Call Number 089.81
Description xx, 84p., [4]p. of Plates : Plates ; 20cm.
Series Stmt Bibliotheca Scriptorum Graecorum et Romanorum Teubneriana ; No. 1113

From: "On-line Public Access Catalog (OPAC)", Bibliotheca Alexandrina, URL: http://www.bibalex.org/English/

Libraries are progressively changing from paper based to electronic systems, first for metadata and then for the information resources themselves.

MAchine-Readable Cataloging (MARC) Format

050 HV1559.A8B682 2000
100 1 Bourk, Michael J
245 10 Universal service? :|btelecommunications policy in Australia and people with disabilities /|cMichael J Bourk ; edited by Tom Worthington
246 3 Telecommunications policy in Australia and people with disabilities
260 Belconnen, A.C.T. :|bTomW Communications,|c2000
300 xiv, 273 p. ;|c21 cm

From: "ANU Full Database", ANU

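The catalogue information can also be displayed in the MARC ("MAchine-Readable Cataloging") format, which libraries began developing in the 1960s. This format uses three-digit numeric codes (tags) to identify each metadata item, for example 100 for the main author and 245 for the title statement, with lettered subfields inside each item.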
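To show how the numeric codes work, here is a minimal Python sketch that splits one line of a MARC display like the one above into its tag, indicators and subfields. It assumes the layout seen in this particular catalogue display: a three-digit tag, an optional one or two indicator digits, then data in which "|x" starts subfield x and the first subfield is implicitly "a". That layout, the heuristic for spotting indicators, and the function name parse_marc_display_line are assumptions for illustration; this is not a parser for the binary MARC exchange format.

```python
import re

# Assumed display layout: tag, optional indicator digits, then the data.
FIELD_RE = re.compile(r"^(\d{3})\s+(?:(\d{1,2})\s+)?(.*)$")


def parse_marc_display_line(line):
    """Return (tag, indicators, {subfield_code: value}) for one display line."""
    match = FIELD_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a MARC display line: {line!r}")
    tag, indicators, data = match.groups()
    subfields = {}
    for position, chunk in enumerate(data.split("|")):
        if not chunk:
            continue  # ignore empty pieces such as a leading "|"
        if position == 0:
            code, value = "a", chunk  # first subfield is implicitly "a"
        else:
            code, value = chunk[0], chunk[1:]
        subfields[code] = value.strip()
    return tag, indicators or "", subfields


tag, indicators, subfields = parse_marc_display_line(
    "260 Belconnen, A.C.T. :|bTomW Communications,|c2000"
)
print(tag, subfields)
```

Here tag 260 (publication details) carries the place in subfield a, the publisher in subfield b and the date in subfield c. A field whose data happens to begin with one or two digits followed by a space would be misread by this heuristic, which is one reason real systems exchange MARC in a structured form rather than as display text.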

MARC adapted to XML

<?xml version="1.0" encoding="UTF-8" ?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    ...
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Arithmetic /</subfield>
      <subfield code="c">Carl Sandburg ; illustrated as an anamorphic adventure by Ted Rand.</subfield>
    </datafield>
    ...
  </record>
</collection>

From: URL: http://www.loc.gov/standards/marcxml//Sandburg/sandburg.xml

As with other metadata formats, MARC is being adapted to XML formats.
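Once MARC is expressed as XML, ordinary XML tools can read it. The sketch below uses Python's standard xml.etree.ElementTree module to pull the subfields of the title field (tag 245) out of a MARCXML fragment. The sample is a cut-down copy of the excerpt above with the elided parts left out, and the function name subfields is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the <collection> element of the excerpt.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Arithmetic /</subfield>
      <subfield code="c">Carl Sandburg ; illustrated as an anamorphic adventure by Ted Rand.</subfield>
    </datafield>
  </record>
</collection>"""


def subfields(marcxml, tag):
    """Return {code: text} for the first datafield carrying the given tag."""
    root = ET.fromstring(marcxml)
    field = root.find(f".//marc:datafield[@tag='{tag}']", MARC_NS)
    if field is None:
        return {}
    return {sf.get("code"): sf.text
            for sf in field.findall("marc:subfield", MARC_NS)}


print(subfields(SAMPLE, "245"))
```

The numeric tags and subfield letters are carried over unchanged as attribute values, so the XML form adds structure without changing the MARC vocabulary.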

MARC to Dublin Core

<?xml version="1.0" ?>
<dc xmlns="http://purl.org/dc/elements/1.1/">
  <title>Arithmetic /</title>
  <creator>Sandburg, Carl, 1878-1967.</creator>
  <creator>Rand, Ted, ill.</creator>
  <type />
  <publisher>San Diego :Harcourt Brace Jovanovich,</publisher>
  <date>c1993.</date>
  <language>eng</language>
  ...
</dc>

From: URL: http://www.loc.gov/standards/marcxml//Sandburg/sandburgdc.xml
See: MARC 21 XML Schema, The Library of Congress, 2003, URL: http://www.loc.gov/standards/marcxml//

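However, for use in non-library systems it is more likely that a MARC record would be converted to the simpler Dublin Core format, in which a handful of named elements (title, creator, publisher, date and so on) replace the numeric codes.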
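A conversion of this kind is essentially a table lookup from MARC tags to Dublin Core element names. The Python sketch below illustrates the idea with a deliberately small mapping. The mapping table, the input representation (a list of tag and subfield pairs) and the function name marc_to_dc are assumptions made for this example; it is not the Library of Congress conversion stylesheet, and the split of the publisher field into subfields a and b is inferred from the Dublin Core excerpt above.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Illustrative subset of a crosswalk: MARC tag -> (DC element, subfield codes).
CROSSWALK = {
    "245": ("title", "ab"),
    "100": ("creator", "a"),
    "700": ("creator", "a"),
    "260": ("publisher", "ab"),
}


def marc_to_dc(fields):
    """Build a Dublin Core element from (tag, {subfield_code: value}) pairs."""
    ET.register_namespace("", DC_NS)
    dc = ET.Element(f"{{{DC_NS}}}dc")
    for tag, subfields in fields:
        if tag in CROSSWALK:
            element, codes = CROSSWALK[tag]
            text = " ".join(subfields[c] for c in codes if c in subfields)
            ET.SubElement(dc, f"{{{DC_NS}}}{element}").text = text
        if tag == "260" and "c" in subfields:
            # The date of publication sits in subfield c of the same field.
            ET.SubElement(dc, f"{{{DC_NS}}}date").text = subfields["c"]
    return dc


record = [
    ("100", {"a": "Sandburg, Carl, 1878-1967."}),
    ("245", {"a": "Arithmetic /"}),
    ("260", {"a": "San Diego :", "b": "Harcourt Brace Jovanovich,", "c": "c1993."}),
    ("700", {"a": "Rand, Ted, ill."}),
]
print(ET.tostring(marc_to_dc(record), encoding="unicode"))
```

Note what is lost: the main author (100) and the illustrator (700) both become plain creator elements, and the MARC indicators disappear. Dublin Core is easier for general systems to consume precisely because it records less.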

Open Archives Initiative

Digital Library Federation Encourages Use of Open Archives Initiative

The Digital Library Federation (DLF) is supporting the development of a small number of Internet gateways through which users will access distributed digital library holdings as if they were part of a single uniform collection. The gateways will be built using the OAI Metadata Harvesting Protocol. DLF gateways will contribute to a practical evaluation of the OAI's harvesting technique and its application within libraries to encourage digital collection managers to expose metadata and build services.

From: Open Archives Initiative, URL: http://www.openarchives.org/, 2001

See: ACS Digital Library and Arrow Discovery Service.

Activities such as the Open Archives Initiative are attempting to construct a virtual library of material using distributed document archives and shared metadata.
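In practice, harvesting means that a gateway sends a simple web request to each archive and receives that archive's metadata records as XML. The Python sketch below assumes version 2.0 of the OAI Protocol for Metadata Harvesting (later than the 2001 announcement quoted above), its ListRecords request and its oai_dc metadata format. The repository address, the sample response and the function names are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the request a harvester would send to one archive."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def harvested_titles(response_xml):
    """Collect every Dublin Core title found in a harvested response."""
    root = ET.fromstring(response_xml)
    return [title.text for title in root.iter(f"{{{DC_NS}}}title")]


# Hypothetical, heavily trimmed response from an archive.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>The Future of Open Source Software</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://repository.example.org/oai"))
print(harvested_titles(SAMPLE_RESPONSE))
```

Because every archive answers the same request with the same Dublin Core elements, a gateway can merge records from many sources into one index without knowing how each archive stores its documents.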

Organisations now considering electronic publications strategies can consider an integrated approach using newer XML tools to create and maintain content.

While the formats for publishing have been controversial, progress has been made on the metadata for publishing systems. The ACS has produced a Digital Library system which provides Dublin Core (DC) metadata via services such as the Arrow Discovery Service.

XML Document Formats

A modified version of the Open Office format, developed by Sun Microsystems, was adopted as an OASIS Standard on May 1, 2005. This "Open Document Format" (ODF) was adopted as the international standard ISO/IEC 26300:2006 on May 3, 2006.
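What an XML document format means in practice can be seen by opening an ODF file with ordinary tools. An ODF document is a ZIP package whose main text is held in a member named content.xml. The Python sketch below lists the paragraph text found there. The function names odf_paragraphs and tiny_package are hypothetical, and the stand-in package built for the demonstration holds only content.xml, so it is not a complete, valid ODF file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odf_paragraphs(odf_file):
    """Return the text of each paragraph in the package's content.xml."""
    with zipfile.ZipFile(odf_file) as package:
        root = ET.fromstring(package.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


def tiny_package(paragraphs):
    """Build an in-memory stand-in package holding only content.xml."""
    root = ET.Element(f"{{{OFFICE_NS}}}document-content")
    body = ET.SubElement(root, f"{{{OFFICE_NS}}}body")
    text = ET.SubElement(body, f"{{{OFFICE_NS}}}text")
    for paragraph in paragraphs:
        ET.SubElement(text, f"{{{TEXT_NS}}}p").text = paragraph
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", ET.tostring(root))
    buffer.seek(0)
    return buffer


print(odf_paragraphs(tiny_package(["Digital Library", "E-Publishing"])))
```

With a real document, the same odf_paragraphs call can be given the path to an .odt file instead of the in-memory stand-in.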

See also: DIS 29500 OOXML

In September 2007 Standards Australia voted to abstain from the ISO/IEC JTC1 ballot to adopt DIS 29500 OOXML (Microsoft's Office Open XML format) as an International Standard, and the vote was lost. Microsoft had given a presentation in favour of the standard and IBM one against it.

Both ODF and OOXML suffer from being derived from legacy word processing packages. A better alternative would be to use XHTML 2 and new CSS standards.

PDF Format

[Image: PDF example]

"The Future of Open Source Software", Bill Appelbe, JRPIT, Volume 35, No. 4, 2003, URL: http://www.acs.org.au/jrpit/JRPITVolumes/JRPIT35/JRPIT35.4.227.pdf

JRPIT PDF Detail

[Image: detail from PDF example]

Detail from "The Future of Open Source Software", Bill Appelbe, JRPIT, Volume 35, No. 4, 2003, URL: http://www.acs.org.au/jrpit/JRPITVolumes/JRPIT35/JRPIT35.4.227.pdf

PDF is commonly used for e-publishing, and XML is used as an intermediate format between word processing documents and PDF. But in this example, zooming in far enough to read the text causes lines to run off the right-hand side of the screen. ODF XML is being used by the National Archives of Australia (NAA) for long-term document storage.

More Information

Slides for these notes are also available.

Copyright © 2007 (version of 17 October 2007) Tom Worthington

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.