Metadata and Electronic Document Management

Metadata for Electronic Commerce

These are notes on website design for the ANU course "Information Technology in Electronic Commerce" (COMP3410/COMP6341). This section of the course is prepared and presented by Tom Worthington FACS HLM, a Visiting Fellow in the Department of Computer Science at the Australian National University (and Director of Tomw Communications Pty Ltd).

Differences Between Metadata for DBMS and E-commerce

Metadata for managing documents (as discussed in the previous section) tends to have a few dozen elements for each document. Most elements are text fields, rather than numeric values or qualified values. Metadata for electronic commerce uses more elements, and more of them hold qualified or numeric values.


The United Nations has agreed on a set of standards for international e-commerce, called UN/EDIFACT:

26. United Nations rules for Electronic Data Interchange For Administration, Commerce and Transport. They comprise a set of internationally agreed standards, directories and guidelines for the electronic interchange of structured data, and in particular that related to trade in goods and services between independent, computerized information systems.

27. Recommended within the framework of the United Nations, the rules are approved and published by UN/ECE in the (this) United Nations Trade Data Interchange Directory (UNTDID) and are maintained under agreed procedures.

From: "UN/EDIFACT Draft Directory", United Nations Economic Commission for Europe, (undated),


EDIFACT is one of the two internationally cited families of standards for Electronic Data Interchange (EDI). The other is the USA's ANS X12 syntax. In most cases the same metadata elements can be used with EDIFACT and ANS X12:

This code list is used by United States Government contracting and grant activities to indicate the data expressions that are contained herein. It is designed principally for use with Electronic Data Interchange (EDI) in either the American National Standard X12 syntax or the United Nations/Electronic Data Interchange for Administration, Commerce, and Transport (UN/EDIFACT) syntax. It may be used in other data systems as appropriate, to include as domain values for standard data schemes or as application data. ...

From: Federal Procurement Code List One (FP1), National Institute of Standards and Technology, 1998 (Revised: April 25, 2001), URL:
No longer on-line, copy at URL:

ANS X12 Example


Small Disadvantaged Business Performing in the US
Other Small Business Performing in the US
Large Business Performing in the US
Javits-Wagner-O'Day Act (JWOD) Participating Nonprofit Agencies
Foreign Concern/Entity ...

From: Federal Procurement Code List One (FP1), National Institute of Standards and Technology, 1998 (Revised: April 25, 2001), URL:

USA Standards for Business Forms

Standards exist for electronic versions of commonly used business forms, such as invoices and Remittance Advice:

From: Federal Procurement Code List One (FP1), National Institute of Standards and Technology, 1999 URL:

An XML/EDI Example: Payment Order

The Interim Report of the CEN/ISSS XML/EDI Pilot Project gives an example of an XML version of an EDIFACT National Payment Order:

<?xml version="1.0"?>
<!DOCTYPE PAY-NAT SYSTEM "pay-nat.dtd">
<PAY-NAT RefNo="0005">
<DTM1 Type="203">19970815</DTM1>
<FII Party="OR">

From: "Interim Report", CEN/ISSS XML/EDI Workshop, 2000, Archived at URL:
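
As a rough illustration of how a program might read such a message, the following Python sketch parses the fragment above with the standard library's xml.etree.ElementTree. The excerpt is truncated, so closing tags are added here only to make it well-formed, and the DOCTYPE line is left out because the DTD file is not to hand; no other content is assumed. Reading the DTM1 value as a CCYYMMDD date is an assumption based on its eight digits, not something the excerpt states, and the function name summarise is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime

# The excerpt above, closed off so that it is well-formed XML.
# The closing tags are an addition; the report's full message has more content.
FRAGMENT = """<?xml version="1.0"?>
<PAY-NAT RefNo="0005">
<DTM1 Type="203">19970815</DTM1>
<FII Party="OR">
</FII>
</PAY-NAT>"""


def summarise(xml_text: str) -> dict:
    """Pull the few values visible in the excerpt into a plain dictionary."""
    root = ET.fromstring(xml_text)
    dtm = root.find("DTM1")
    fii = root.find("FII")
    return {
        "message": root.tag,
        "ref_no": root.get("RefNo"),
        "date_type": dtm.get("Type"),
        # Assumption: the eight digits are a CCYYMMDD calendar date.
        "date": datetime.strptime(dtm.text, "%Y%m%d").date(),
        "fii_party": fii.get("Party"),
    }


summary = summarise(FRAGMENT)
print(summary)
```

Attributes such as RefNo and Type arrive as strings; any conversion (as with the date here) is the application's decision rather than something XML itself provides.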

Payment Order Elements

Some elements used are:

PAY-NAT Container for the message segments ...
BGM Identifies the beginning of the message...
MOA Monetary amount of payment. Defaults to GBP - Pounds sterling ...
FII Container for financial institution information...

From: "Interim Report", CEN/ISSS XML/EDI Workshop, 2000, Archived at URL:


Part of the XML document type definition (DTD) of this message is:

<!ATTLIST PAY-NAT
   UN-EDIFACT:Prefix    CDATA   #FIXED   "UNH"
   RefNo                CDATA            #IMPLIED
   MessageTypeID        CDATA   #FIXED   "PAYEXT"
   Version              CDATA   #FIXED   "D"
   ReleaseNumber        CDATA   #FIXED   "96A"
   Agency               CDATA   #FIXED   "UN"
   AssociationCode      CDATA   #FIXED   "SIMP01" >
<!ATTLIST MOA
   UN-EDIFACT:Prefix    CDATA   #FIXED   "MOA"
   Type                 CDATA   #FIXED   "9"
   Currency             CDATA            "GBP" >

From: "Interim Report", CEN/ISSS XML/EDI Workshop, 2000, Archived at URL:
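
To show what these declarations mean in practice, here is a minimal Python sketch of the attribute handling a validating parser would be expected to perform for the MOA element: a #FIXED attribute must carry the declared value, while a plain default (Currency "GBP") is supplied only when the document omits it. The rule table, the function name apply_rules and the sample amounts are illustrative assumptions, and the Prefix attribute is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Rules transcribed by hand from the MOA lines of the DTD excerpt.
# "FIXED": attribute must equal the value; "DEFAULT": value used when absent.
MOA_RULES = {
    "Type": ("FIXED", "9"),
    "Currency": ("DEFAULT", "GBP"),
}


def apply_rules(element, rules):
    """Return the element's attributes with DTD-style defaults filled in."""
    attrs = dict(element.attrib)
    for name, (kind, value) in rules.items():
        if name not in attrs:
            attrs[name] = value  # both FIXED and DEFAULT supply a value
        elif kind == "FIXED" and attrs[name] != value:
            raise ValueError(f"{name} is fixed to {value!r}, got {attrs[name]!r}")
    return attrs


# Hypothetical amounts, not taken from the report.
plain = ET.fromstring("<MOA>100.00</MOA>")
dollars = ET.fromstring('<MOA Currency="USD">100.00</MOA>')

print(apply_rules(plain, MOA_RULES))
print(apply_rules(dollars, MOA_RULES))
```

A sender can therefore omit Currency for sterling payments and state it only for other currencies, whereas Type cannot be changed without breaking the rules.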

This is a reasonably readable example. However, there is a bewildering array of such proposed standards. In addition, commercial vendors of electronic document and e-commerce products use variations of standards and draft proposed standards, or attempt to create de facto standards based on market dominance.

W3C XML E-commerce Standards

W3C provides a very useful table for comparing XML protocols. As with all good standards development, W3C has been taking technologies developed by industry and turning them into standards. W3C started at the bottom end, developing technical document standards, and has more recently been working its way up into data definitions, structure, transaction formats and discovery services.
The XML e-commerce standards are relatively new. There tends to be a heavy overlap of the companies involved. SOAP was developed by a consortium of Ariba, Inc., Commerce One, Inc., Compaq, HP, IBM, Microsoft, SAP and other major companies and is now being standardised by W3C. BizTalk was developed by Microsoft. WSDL was developed by Ariba, IBM and Microsoft. Beyond W3C's technical brief there are other standards which describe specific commercial transactions, such as ebXML from UN/CEFACT and OASIS.

Making the situation more confusing is the overlap between business domains and technical standards. Early work mixed up the definition of what sort of business information could be described (for example, a payment advice note) with the format in which the information was encoded (such as XML). Also, many of the standards documents are difficult to find, being stored in large PDF documents or at web addresses which change (where, for example, is the document defining Microsoft's BizTalk?).
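
The distinction can be made concrete with a short Python sketch in which one business record (a payment amount and date) is kept separate from two encodings of it: an XML form modelled on the payment order above and an EDIFACT-style segment form. The class and function names and the amount are hypothetical, treating 203 as the date qualifier with a CCYYMMDD format code of 102 is an assumption, and the segment strings are simplified illustrations rather than a complete, validated EDIFACT message.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Payment:
    """Business content only: nothing here says how it is encoded."""
    ref_no: str
    date: str      # CCYYMMDD
    amount: str
    currency: str


def to_xml(p: Payment) -> str:
    """Encode the payment as XML, using element names from the example above."""
    root = ET.Element("PAY-NAT", RefNo=p.ref_no)
    ET.SubElement(root, "DTM1", Type="203").text = p.date
    ET.SubElement(root, "MOA", Type="9", Currency=p.currency).text = p.amount
    return ET.tostring(root, encoding="unicode")


def to_segments(p: Payment) -> str:
    """Encode the same payment as EDIFACT-style segments.

    '+' separates data elements, ':' separates components and an
    apostrophe ends each segment. Simplified: no envelope segments.
    """
    return (f"DTM+203:{p.date}:102'"
            f"MOA+9:{p.amount}:{p.currency}'")


# Hypothetical amount; reference number and date are from the example above.
payment = Payment(ref_no="0005", date="19970815", amount="100.00", currency="GBP")
print(to_xml(payment))
print(to_segments(payment))
```

Keeping the Payment record free of any encoding detail is the design point: a new syntax needs only a new encoding function, and the business definition stays unchanged.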

The W3C standards publication process has greatly improved this situation by providing well-formatted web documents which are easily found at fixed URLs and by avoiding addressing the business domain. It is easy to find a W3C standard using a web search, to copy a section out of it and paste it (complete with formatting) into a document and to cite the URL of the standard with a reasonable expectation it will still be there when someone goes looking for it. What is needed is for those proposing business standards to follow W3C's lead by providing documents which address the business domain and which can be used just as easily.


WSDL: Web Services Description Language

SOAP: A lightweight protocol for exchanging structured information in a decentralized, distributed environment.

XML Schema: For describing the structure and constraining the contents of XML 1.0 documents
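
As a hedged illustration of SOAP, the lightweight protocol described above, this Python sketch wraps a small XML payload in a SOAP 1.1 Envelope and Body using the standard library. The namespace URI is the one defined by SOAP 1.1; the payload (a bare PAY-NAT element borrowed from the payment order example) and the function name wrap_in_soap are assumptions for illustration, and no transport such as HTTP is shown.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("soap", SOAP_NS)


def wrap_in_soap(payload: ET.Element) -> str:
    """Return a SOAP 1.1 envelope whose Body contains the given payload."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical payload: a bare payment order element.
order = ET.Element("PAY-NAT", RefNo="0005")
message = wrap_in_soap(order)
print(message)
```

The envelope says nothing about what a payment order means; that remains the job of the business-level standard carried inside the Body.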

Document Related Standards


XSL: Extensible Stylesheet Language

XSLT: XSL Transformations, for transforming XML documents into other XML documents.

XHTML Basic: XHTML subset for Small Information Appliances

XML: Extensible Markup Language