Metadata

Tom Worthington FACS HLM

For the course Electronic Document Management

Metadata

The Oxford English Dictionary describes metadata as:

metadata n., a set of data that describes and gives information about other data...

[1968 Proc. IFIP 4th Congr.: Suppl. 10 I. 113/2 There are categories of information about each data set as a unit in a data set of data sets, which must be handled as a special meta data set.] 1987 Philos. Trans. Royal Soc. A. 322 373 The challenge is to accumulate data..from diverse sources, convert it to machine-readable form with a harmonized array of *metadata descriptors and present the resulting database(s) to the user. 1998 New Scientist 30 May 35/2 With XML, attaching metadata to a document is easy, at least in theory.

Oxford English Dictionary, (Online) Draft entry Dec. 2001, URL: http://dictionary.oed.com/cgi/entry/00307096/00307096se19

Metadata can be described simply as "Data about Data". As an example, the "creator" of this document is "Tom Worthington". The data is "Tom Worthington" and the metadata is "creator".

Metadata provides standard data items to allow parties to communicate about their organisations, products, terms and conditions. In an electronic transaction, the payment and the "money" itself consist of data in an agreed metadata format. Without suitable metadata standards, e-commerce could not take place and "money" in our online financial systems would cease to exist.

Metadata can also be used to describe published documents. The use of metadata for e-commerce and for publishing has converged in the last few years with the use of the same XML technology for both applications.

Australian Government Metadata

<meta name="DC.Publisher" scheme="X500" content="ou=Australian Government Information Management Office (AGIMO) ; o= Commonwealth of Australia ; c=AU">
<meta name="DC.Description" content="The australia.gov.au website is your connection with government in Australia...">
<meta name="DC.Subject" scheme="TAGS" content="Government information; Federal government; Government services; Government publications; Web sites">
<meta name="DC.Type.documentType" scheme="agls-document" content="homepage">

From: "australia.gov.au : your connection with government", Australian Government Information Management Office, 2004-06-30, URL: http://www.australia.gov.au/
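The tags above can be read by software as well as by people. The following is a minimal sketch, using only the Python standard library, of how a program might collect the name, scheme and content of each metadata item. The class name MetaCollector and the shortened sample are illustrative assumptions, not part of any AGLS tool.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect (name, scheme, content) from each <meta> tag that has a name."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases tag and attribute names before calling this.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            self.records.append(
                (name, attributes.get("scheme"), attributes.get("content"))
            )


# Two of the tags from the example above.
sample = """
<meta name="DC.Subject" scheme="TAGS" content="Government information; Federal government; Government services; Government publications; Web sites">
<meta name="DC.Type.documentType" scheme="agls-document" content="homepage">
"""

collector = MetaCollector()
collector.feed(sample)
for name, scheme, content in collector.records:
    print(name, scheme, content, sep=" | ")
```

Because the parser lower-cases tag and attribute names, the same sketch should also accept the upper-case <META NAME=...> form used in the AGLS examples.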

Tax Office e-commerce transaction

<FORM_PERIOD_LABEL_TEXT>July to September 2001</FORM_PERIOD_LABEL_TEXT>
<EFT_CODE> 51111 121 059 9059</EFT_CODE>
<BILLER_CODE>75556</BILLER_CODE>
<PAYG_WITHHOLDING>0</PAYG_WITHHOLDING>
<PAYG_INSTALMENT>12541</PAYG_INSTALMENT>
<DEFERRED_COMPANY_FUND_INSTALMENT>7879801 </DEFERRED_COMPANY_FUND_INSTALMENT>
<TOTAL_DEBITS>7892342</TOTAL_DEBITS>
<TOTAL_CREDITS>0</TOTAL_CREDITS>
<NET_AMOUNT_FOR_THIS_STATEMENT>7892342 </NET_AMOUNT_FOR_THIS_STATEMENT>
<GST_LABEL_TEXT>for the QUARTER from 1 Jul 2001 to 30 Sep 2001</GST_LABEL_TEXT>
<GST_ACCOUNTING_METHOD_LABEL_TEXT>Cash ...

From: Formatting the eBAS with XSL, Tom Worthington, 29 November 2002, URL: http://www.tomw.net.au/2002/atoxml.html

This is an example of an e-commerce transaction: an Australian Taxation Office electronic tax form for the Goods and Services Tax (GST).
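Because each amount is labelled with metadata, a program can check the form without knowing its layout. The sketch below parses some of the fields shown above and checks the totals. Two things are assumptions made for illustration: the <EBAS> wrapper element (the fragment has no root element of its own) and the choice of which fields add up to TOTAL_DEBITS, which is inferred from the sample figures rather than taken from Tax Office documentation.

```python
import xml.etree.ElementTree as ET

# Fields copied from the example above. <EBAS> is a hypothetical wrapper,
# added only because an XML parser needs a single root element.
fragment = """<EBAS>
<PAYG_WITHHOLDING>0</PAYG_WITHHOLDING>
<PAYG_INSTALMENT>12541</PAYG_INSTALMENT>
<DEFERRED_COMPANY_FUND_INSTALMENT>7879801 </DEFERRED_COMPANY_FUND_INSTALMENT>
<TOTAL_DEBITS>7892342</TOTAL_DEBITS>
<TOTAL_CREDITS>0</TOTAL_CREDITS>
<NET_AMOUNT_FOR_THIS_STATEMENT>7892342 </NET_AMOUNT_FOR_THIS_STATEMENT>
</EBAS>"""

root = ET.fromstring(fragment)

# Map each element name (the metadata) to its integer value (the data).
# strip() removes the stray spaces present in some of the sample values.
amounts = {child.tag: int(child.text.strip()) for child in root}

# Assumption: these three items are the ones summed into TOTAL_DEBITS.
debit_items = [
    "PAYG_WITHHOLDING",
    "PAYG_INSTALMENT",
    "DEFERRED_COMPANY_FUND_INSTALMENT",
]
debits = sum(amounts[tag] for tag in debit_items)
net = amounts["TOTAL_DEBITS"] - amounts["TOTAL_CREDITS"]

print("Debits consistent:", debits == amounts["TOTAL_DEBITS"])
print("Net consistent:", net == amounts["NET_AMOUNT_FOR_THIS_STATEMENT"])
```

For the sample figures both checks succeed: 0 + 12541 + 7879801 = 7892342.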

Dublin Core

Title Typically, Title will be a name by which the resource is formally known.
Creator Examples of Creator include a person, an organization, or a service. ...
Subject ... keywords, key phrases or classification codes that describe a topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme.
Description ... an abstract, table of contents, reference to a graphical representation of content or a free-text account of the content. ...

Adapted from "Dublin Core Metadata Element Set", Version 1.1: Reference Description, DCMI, 2003-06-02, URL: http://dublincore.org/documents/dces/

Dublin Core (DC) is a metadata standards project originating from a workshop held in Dublin, Ohio, USA in 1995. The "Dublin Core" metadata element set is a small set of metadata definitions intended for cross-domain information resources. However, DC has its origins in the work of librarians and so tends to work better for describing printed text than for other items, such as video.

The intention with DC is to provide a brief standard set of essential metadata items for resources: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights.

Other examples of controlled vocabularies are the Internet Media Types (MIME), used to define computer media formats in the Format element, and language tags, such as "en-AU" for Australian English.
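A controlled vocabulary lets software check a value rather than just store it. The sketch below is illustrative only: it uses Python's standard mimetypes module to suggest a media type for the Format element, and a deliberately simplified pattern for language tags. Real language tags allow more subtags than this pattern accepts, and the function names are assumptions made for this example.

```python
import mimetypes
import re

# Simplified assumption: a 2-3 letter language code, optionally followed
# by a 2-letter region code, as in "en-AU". Real tags can be longer.
LANGUAGE_TAG = re.compile(r"^[a-z]{2,3}(-[A-Z]{2})?$")


def looks_like_language_tag(value):
    """Return True if value matches the simplified language tag pattern."""
    return bool(LANGUAGE_TAG.match(value))


def guess_format(filename):
    """Return the Internet Media Type suggested by the file name, or None."""
    media_type, _encoding = mimetypes.guess_type(filename)
    return media_type


print(looks_like_language_tag("en-AU"))
print(looks_like_language_tag("Australian English"))
print(guess_format("notes.html"))
```

The free-text value "Australian English" is rejected, while "en-AU" is accepted: this is the benefit of selecting from a controlled vocabulary.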

AGLS

Element: Example

Function: <META NAME="AGLS.Function" CONTENT="School Education">

Availability: <META NAME="AGLS.Availability" CONTENT="Medical assistance is available by contacting the after hours hotline on ...">

Audience: <agls:audience>anglers</agls:audience>

Mandate: <META NAME="AGLS.Mandate.case" SCHEME="URI" CONTENT="http://...">

Compiled from AGLS Metadata Element Set, Part 2: Usage Guide, Version 1.3, National Archives of Australia, 2002, URL: http://www.naa.gov.au/recordkeeping/gov_online/agls/metadata_element_set.html

The Australian Government Locator Service (AGLS) metadata standard is a set of 19 descriptive elements to improve the visibility and accessibility of services and information over the Internet. The AGLS standard is based on the 15 Dublin Core elements, plus four extra elements.

AGLS Mandatory Elements

  • Creator
  • Publisher (note: this element is not mandatory for descriptions of services)
  • Title
  • Date
  • Subject OR Function
  • Identifier OR Availability

From: AGLS Metadata Element Set, Part 2: Usage Guide, Version 1.3, National Archives of Australia, 2002, URL: http://www.naa.gov.au/recordkeeping/gov_online/agls/metadata_element_set.html

No elements are mandatory for DC, but AGLS requires five (or six) of these.
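The mandatory element rules, including the two "OR" pairs and the exception for services, can be expressed as a short check. This is a sketch of the rules as listed above, not an official AGLS validator. The function name and the sample record are assumptions made for illustration.

```python
def missing_agls_elements(record, is_service=False):
    """Return the mandatory AGLS requirements that record does not meet.

    record maps element names (such as "Creator") to values.
    Elements with empty values are treated as absent.
    """
    present = {name for name, value in record.items() if value}
    missing = []

    # Always mandatory.
    for element in ("Creator", "Title", "Date"):
        if element not in present:
            missing.append(element)

    # Publisher is not mandatory for descriptions of services.
    if not is_service and "Publisher" not in present:
        missing.append("Publisher")

    # At least one element from each pair is required.
    if not present & {"Subject", "Function"}:
        missing.append("Subject OR Function")
    if not present & {"Identifier", "Availability"}:
        missing.append("Identifier OR Availability")

    return missing


# Hypothetical record for these notes, deliberately incomplete.
record = {
    "Creator": "Tom Worthington",
    "Title": "Metadata",
    "Date": "2007-10-17",
    "Subject": "Metadata",
}
print(missing_agls_elements(record))
```

For this record the check reports that Publisher and "Identifier OR Availability" are missing. With is_service=True, only the second would be reported.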

Qualifiers

Qualifiers are additions and extensions to the metadata elements that give metadata creators the option to refine the semantics of the element set, and add precision to the values of the metadata elements. For example, it may be useful to indicate that the value has been selected from a particular controlled vocabulary, such as a list of keywords, or is encoded using a particular convention - the format for dates is an important case - or in a particular natural language.

From: AGLS Metadata Element Set, Part 2: Usage Guide, Version 1.3, National Archives of Australia, 2002, URL: http://www.naa.gov.au/recordkeeping/gov_online/agls/metadata_element_set.html

Qualifiers are used to restrict the semantics of the relationship between the resource and the element value. AGLS encourages more use of qualifiers than DC, but does not require it.

AGLS Qualifiers

AGLS uses two types of qualifiers:

  1. Element refinements are represented in HTML <meta> syntax with qualifiers appended to the element names. For example: "DC.Type.documentType".

  2. Encoding schemes indicate how the value is to be interpreted if it has been chosen from a controlled vocabulary, or externally defined standard. For example:

<META NAME="DC.Date.modified" SCHEME="ISO8601" CONTENT="1998-08-27">.
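Both types of qualifier can be handled by software. The sketch below splits a qualified name into its element and refinement, and uses the encoding scheme to decide how to check the value. It is a simplified illustration: only the ISO8601 scheme is handled, only the full date form (such as 1998-08-27) is assumed, and the function names are hypothetical.

```python
from datetime import date


def split_qualified_name(name):
    """Split "DC.Date.modified" into ("DC", "Date", "modified").

    The refinement is None when the name has no qualifier.
    """
    parts = name.split(".", 2)
    if len(parts) < 2:
        raise ValueError("expected a name such as DC.Date, got: " + name)
    refinement = parts[2] if len(parts) > 2 else None
    return parts[0], parts[1], refinement


def check_value(scheme, content):
    """Check content against its encoding scheme.

    Returns True if the value is valid, None if the scheme is not
    one this sketch knows about. Raises ValueError if invalid.
    """
    if scheme == "ISO8601":
        date.fromisoformat(content)  # raises ValueError for a malformed date
        return True
    return None


print(split_qualified_name("DC.Date.modified"))
print(check_value("ISO8601", "1998-08-27"))
```

A value such as "27/08/1998" would raise a ValueError, which is the point of declaring the scheme: the reader of the metadata knows how the date is to be interpreted.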

Metadata Tools

This is a demonstration of DSTC's Reg metadata editor. Reg allows you to:

  • enter metadata
  • export metadata in a number of syntaxes
  • save metadata records to a test repository
  • reload metadata records from a repository for editing

Reg uses metadata schemas to customize itself for different metadata element sets. ...

"Reg - Metadata Editor", DSTC Pty Ltd, 1998, 2000, URL: http://metadata.net/cgi-bin/reg/demo.cgi.

Metadata is rarely entered by the document author typing in text. When encoded in the header of an HTML document, the metadata is not displayed by a web browser. Specialized software, such as a content management system, or features in word processors are used to enter and display the metadata. The user of the system is likely to be unaware they are using a metadata standard or how it is encoded. Examples of these systems will be shown later.

The Distributed Systems Technology Centre (DSTC Pty Ltd) has produced a metadata tool, Reg, to create AGLS and Dublin Core metadata. Reg can be used to generate AGLS metadata syntax. This would be too cumbersome for creating real metadata, but is a useful way to learn about the process.
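To show what generating metadata syntax involves, here is a minimal sketch that turns a metadata record into HTML <meta> tags. It is not how Reg works internally, which these notes do not describe. The function name and the sample record are assumptions made for illustration.

```python
from html import escape


def to_meta_tags(record):
    """Build one HTML <meta> tag per entry in record.

    record maps a (possibly qualified) element name to a
    (scheme, content) pair. scheme may be None.
    """
    lines = []
    for name, (scheme, content) in record.items():
        # escape() also converts quotation marks, so values cannot
        # break out of the attribute.
        scheme_attr = f' scheme="{escape(scheme)}"' if scheme else ""
        lines.append(
            f'<meta name="{escape(name)}"{scheme_attr} content="{escape(content)}">'
        )
    return "\n".join(lines)


# Hypothetical record describing these notes.
record = {
    "DC.Creator": (None, "Tom Worthington"),
    "DC.Title": (None, "Metadata"),
    "DC.Date.modified": ("ISO8601", "2007-10-17"),
}
print(to_meta_tags(record))
```

Content management systems do much the same thing behind the scenes, which is why the author need not see the encoding.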

More Information

Slides for these notes are also available.

Copyright © 2007 (version of 17 October 2007) Tom Worthington

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.