Wednesday, August 26, 2009

Australian National Data Service

Greetings from the Australian National University in Canberra, where Ian Barnes is giving a presentation on the Australian National Data Service to my e-commerce students. ANDS aims to make the data behind research in Australia accessible online. The services provided include both a search interface for human readers and machine-to-machine web services for accessing the data.

One interesting question is why scientists would share their data. One reason is that sharing tends to result in the scientist's work being more widely cited, which helps their career. A less obvious reason is that making the data available helps ensure the data is preserved for long-term use, including by the original creator. Another reason is that it may be required by funding bodies, partly to help prevent academic fraud.

Interestingly, some of the initial data in the ANDS system is research data about humpback whales near the location of the proposed multi-billion dollar north-west gas platform.

The service provides a Google Maps interface using the geotagging information in the metadata. It uses a different metadata standard from the ISO 19115/19139 standard used by some repositories, but there is provision for converting the data. The service uses the same OAI interface as electronic document repositories, and it also provides persistent identifiers.

The service has a harvesting process which searches registered data sources to find new collections of data to index. It uses RIF-CS, based on ISO 2146, to represent the collection descriptions in XML.
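As a rough sketch of the machine-to-machine side, an OAI-PMH harvest of such a registry might look like the following Python. The endpoint URL and the metadata prefix are placeholders for illustration, not the actual ANDS service details.

# Minimal OAI-PMH harvesting sketch using only the Python standard library.
# The endpoint URL and metadata prefix are placeholders, not the real
# ANDS service details.
import urllib.request
import xml.etree.ElementTree as ET

ENDPOINT = "http://example.org/oai"   # hypothetical registry endpoint
OAI = "{http://www.openarchives.org/OAI/2.0/}"

def list_identifiers(endpoint, metadata_prefix="rif"):
    """Ask the repository which records it holds (first page of results only)."""
    url = endpoint + "?verb=ListIdentifiers&metadataPrefix=" + metadata_prefix
    with urllib.request.urlopen(url) as response:
        tree = ET.parse(response)
    for header in tree.iter(OAI + "header"):
        print(header.findtext(OAI + "identifier"),
              header.findtext(OAI + "datestamp"))

if __name__ == "__main__":
    list_identifiers(ENDPOINT)

The same request could be made by hand in a web browser, by appending "?verb=ListIdentifiers&metadataPrefix=rif" to the repository address.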


Sunday, July 12, 2009

Designing a course module in Metadata and Electronic Data Management - Part 3

Having the general direction for the course module on Metadata and Electronic Data Management, what should the students be able to do at the end of the IT in e-Commerce course? The numerous seminars on how to design courses I have attended over the last year have emphasised the importance of learning objectives and of assessment as part of the learning process. This is not just about setting a test at the end to see if the students can remember things.

In order to prepare some Learning Outcomes, I did a web search for other courses on metadata and document management to see what they had. The first result was the University of Manchester's "COMP30352: Information Retrieval, Hypermedia and the Web"; however, this seems to be more of a web course. The second was "IT in E Commerce COMP6341" at the ANU. It took me some moments to realise this was the course I was teaching. Someone had already written the learning outcomes:
Learning Outcomes:

The focus of this course is on document representation, knowledge discovery, storage and retrieval, and electronic trading. The areas covered include XML, XSL, DTD, metadata, data management and different forms of trading such as deliberative, spontaneous and auctions. Other topics will be included to match recent developments and maturation of the area, such as web application frameworks, web services and the semantic web.

Rationale: Electronic Commerce is an area that is growing in leaps and bounds. The use of information technology is at the heart of electronic commerce. It is important that students doing a degree in Information Systems have a sound understanding of the role that information technology plays in electronic commerce. This course, along with the course on Internet, Intranet and Document Systems, is meant to do just that. It looks at some of the current and potential uses of information technology in electronic commerce. The topics covered include document representation in the form of XML, XSL, DTD's; knowledge discovery using metadata and data mining; data management as in the case of Digital Libraries and Electronic Document Management; trading, including deliberative, spontaneous and auctions; and security (public keys, PKI, digital signatures, etc). Other topics would be included as the area matures. It is anticipated that this course will be of interest to people in the industry as well.

This course is responsible for:

  • current trends in representation of data and documents on the web
  • knowledge discovery in the form of metadata and data mining
  • database management in electronic commerce
  • electronic trading
  • security in electronic commerce.

The following topics will be addressed:

  • knowledge representation - XML, XSL, DTD, CSS
  • knowledge discovery - metadata and data mining.
  • data management - digital libraries and electronic document management
  • trading - deliberative, spontaneous and auctions
  • security - public keys, symmetric keys, PKI, authentication, digital signatures, etc.

Upon completion of this course, the student will be able to do the following:

  1. Describe the XML language, write simple DTD's, write CSS style sheets for documents, and explain where XML can be applied to advantage and why.
  2. Describe the use of metadata, and describe the current trends in data mining.
  3. Describe how digital libraries and electronic document management work.
  4. Describe the different kinds of trading that an individual, or an organisation, can do electronically. Explain the advantages and limitations of electronic trading, and the risks involved.
  5. Explain why security is such a big issue in electronic commerce and how it is being addressed. Describe key concepts like public keys, symmetric keys, PKI, authentication and digital signatures. Given a system specification, come up with a design that allows secure transmission of information.
From: "IT in E Commerce COMP6341", Course Details, ANU, 2009

The last part is of most interest, saying what the student should be able to do on completion of the course:
  1. Describe the XML language, write simple DTD's, write CSS style sheets for documents, and explain where XML can be applied to advantage and why.
  2. Describe the use of metadata, and describe the current trends in data mining.
  3. Describe how digital libraries and electronic document management work.
  4. Describe the different kinds of trading that an individual, or an organisation, can do electronically. Explain the advantages and limitations of electronic trading, and the risks involved.
  5. Explain why security is such a big issue in electronic commerce and how it is being addressed. Describe key concepts like public keys, symmetric keys, PKI, authentication and digital signatures. Given a system specification, come up with a design that allows secure transmission of information.

The wording of this is curiously loose, for example "...why security is such a big issue ...". Also the use of the term "describe" seems too passive for an IT course, which should be about being able to do things, not just describe them.
Extracting the items relating to metadata and electronic document management:

  • Describe the use of metadata ...
  • Describe how digital libraries and electronic document management work.

A better way to put this may be (a small sketch follows the list):

    1. Use the XML language to define document structures
    2. Use XSLT to transform documents and CSS to present them
    3. Use metadata to describe documents for use in digital libraries and electronic document management
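To give a flavour of what "use" rather than "describe" might look like, here is a small sketch covering the first two proposed outcomes. It uses the third-party lxml library purely for illustration (it is not part of the course materials): a trivial note document is defined by a DTD, transformed to HTML with XSLT, and left to a hypothetical note.css file for presentation.

# Sketch: define a document structure with a DTD, validate an instance,
# then transform it to HTML with XSLT. Requires the third-party lxml package.
from io import StringIO
from lxml import etree

# 1. Define the structure and validate an instance against it.
dtd = etree.DTD(StringIO(
    "<!ELEMENT note (title, body)>"
    "<!ELEMENT title (#PCDATA)>"
    "<!ELEMENT body (#PCDATA)>"))
doc = etree.XML("<note><title>Metadata</title><body>Describes data.</body></note>")
print(dtd.validate(doc))   # True: the instance matches the declared structure

# 2. Transform the XML to HTML; presentation is delegated to a CSS file
#    (note.css is hypothetical) referenced from the generated page.
stylesheet = etree.XML("""\
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/note">
    <html>
      <head><link rel="stylesheet" href="note.css"/></head>
      <body><h1><xsl:value-of select="title"/></h1>
            <p><xsl:value-of select="body"/></p></body>
    </html>
  </xsl:template>
</xsl:stylesheet>""")
to_html = etree.XSLT(stylesheet)
print(str(to_html(doc)))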
In the course I previously spent a lot of time describing how e-publishing systems worked in general, and the history of publishing, to provide a context for XML-based publishing. This is of little interest to present-day IT students, for whom paper publishing and library card catalogues are not part of their experience, having been born after e-publishing and computer catalogues became the norm.

    Also I spent a lot of time saying what was wrong with PDF. While there is still much wrong with PDF, there seems little point in spending time on that, when instead alternatives could be presented. Otherwise this is much like presenting what is wrong with private cars and roads to transport engineers.

    Some other parts of the course can be emphasised. As an example the IFIP Digital Library which was speculated about last year has now become a reality, with the ANU providing the system for users around the globe. It is unlikely that students will have much interest or understanding of the idea that the material in the digital library was once available primarily on paper. They may also have difficulty making the connection between the digital library and the buildings on campus which are still called a library. The lower floors of these buildings have been cleared of most paper, to provide space for computer access, with perhaps a few serials and new books on display as historical curiosities.


    Designing a course module in Metadata and Electronic Data Management - Part 2

Having worked out how much material is needed for a course module on Metadata and Electronic Data Management, what exactly is it for? The description of the IT in e-Commerce course refers to: "... document representation (XML, XSL, DTD, CSS), knowledge discovery (meta-data, information retrieval), data management (digital library, electronic document management), trading (spontaneous, deliberative, auctions) and security (encryption, public key, symmetric key, PKI, authentication, etc). ...". So the course is about how to design e-documents, and protect and manage them, so that they can be found and used for transactions in business.

The ANU is in Canberra, the seat of the Australian Government, and many students work for the government, so many of the examples in the course are drawn from government business. Also, because some of the students go on to be academics and researchers, academic publishing is used as an example.

    There are some common problems for people in business, government and academia: how do I create an e-document which will be flexible for use by different people at different times? How can it be kept? How can it be found? How can it be authenticated?

The problem with e-documents is coping with the volume of material. Workers are being overwhelmed with the volume of email and attachments. Just as they get used to e-mail, along come blogs, wikis, twits and other technologies to cope with.

The course teaches the use of XML-based technology. The idea is that you create the documents in a format which reflects the information content, separate from how the document will look to the reader. This goes beyond the separation of structure from presentation for web pages. With an HTML document, if you strip off the presentation layer, the document still looks like a text document. However, with XML data, once the data definitions are removed, you have just a jumble of letters and numbers.
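A small illustration of that point, sketched in Python with an invented order record: with the markup present a program knows what each value means; with it stripped away, only the jumble remains.

# Illustration: without its markup, XML data is just a jumble of values.
# The order record is invented for illustration.
import xml.etree.ElementTree as ET

record = ET.fromstring(
    "<order><number>10423</number><date>2009-08-26</date>"
    "<item code='A7'>Widget</item><quantity>12</quantity></order>")

# With the markup, a program knows what each value means.
print(record.findtext("number"), record.findtext("date"))

# With the markup stripped, a human (or a program) is left guessing.
print("".join(record.itertext()))   # -> 104232009-08-26Widget12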

    The key point in terms of knowledge discovery is metadata. The metadata can be used to find the data and also substitute for it in many processes. In the case of XML documents metadata is also used to define the data structure.
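As a small, contrived sketch of metadata standing in for the documents it describes, a query can be answered from Dublin Core style catalogue records alone, without the documents themselves ever being opened (the records here are invented for illustration).

# Sketch: metadata records standing in for the documents they describe.
# The records and the query are invented for illustration.
catalogue = [
    {"dc.title": "Metadata and Electronic Data Management notes",
     "dc.creator": "Tom Worthington", "dc.date": "2009"},
    {"dc.title": "Green ICT Strategies notes",
     "dc.creator": "Tom Worthington", "dc.date": "2009"},
]

def search(records, term):
    """Find records whose title mentions the term, without opening any document."""
    return [r for r in records if term.lower() in r["dc.title"].lower()]

print(search(catalogue, "metadata"))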

Students have considerable difficulty understanding what metadata is. The popularisation of metadata through Tags on web resources, such as images, blog postings and instant messages, provides a useful example.

Previously I introduced metadata from the technical point of view and then illustrated it with popular examples such as Tags. Perhaps it might be better to reverse this and introduce tags first.

In introducing electronic document management I went into considerable detail about the procedures used by the Australian Government. While this was popular with professional records managers and archivists, it was of little interest to IT students. It also seems a losing battle in government, with such records management systems falling into disuse. While I can't solve the government's problems by myself, perhaps I can suggest some different techniques to the students.

Deleting most of the material about records management procedures will make room for material on newer electronic formats for use by business.


    Saturday, July 11, 2009

    Designing a course module in Metadata and Electronic Data Management

    How do I create a course module on "Metadata and Electronic Data Management"? This year I have again been asked to help teach students in the course Information Technology in Electronic Commerce (COMP3410) at ANU.

The content will be much the same as last year, but I would like to package the material up more neatly. This is partly prompted by my resolution last year that I had given my last lecture. Also the material currently lacks a coherent theme and is much longer than it should be. In addition I would like to revise some of the material which is based on old EDI standards and old Australian government records management guidelines.

    How much?

    But where to start? The first step is to get some idea of how much material is required. Previously I gave about five or six lectures and a lab covering the material. This equates to about two weeks of a course.

Last year's notes for the course are the equivalent of 36 A4 pages, or about 18 pages per week. At one end of the spectrum my notes for Green ICT Strategies (COMP7310) are about 3 A4 pages per week, whereas the web technology lectures for COMP2410/6340 - Networked Information Systems are 24 pages per week. This range can be accounted for by the Green ICT course being at the master's level and assuming the student does more independent reading. Also the Green ICT notes are mostly English text, whereas the web technology notes contain examples of code, which take up more space. So at 18 pages per week, the metadata and data management notes seem about right, but perhaps could be trimmed a little.

    Where does it fit in the skill set?

The Metadata and Electronic Data Management material was just whatever I thought might be relevant when it was first presented in 2000. It was designed to fit with what else was included in the course and related courses, but with no thought as to how it fitted into the careers of the people being trained.

To position the Green ICT Strategies course, the Skills Framework for the Information Age (SFIA) was used. A search of SFIA found only one skill definition which mentioned metadata: Information management (IRMG):
    The overall management of information, as a fundamental business resource, to ensure that the information needs of the business are met. Encompasses development and promotion of the strategy and policies covering the design of information structures and taxonomies, the setting of policies for the sourcing and maintenance of the data content, the management and storage of electronic content and the analysis of information structure (including logical analysis of data and metadata). Includes overall responsibility for compliance with regulations, standards and codes of good practice relating to information and documentation records management, information assurance and data protection. ...

From: Information management (IRMG), Strategy & planning, Information strategy, SFIA, Version 3, 2005
For the undergraduate version of the course this would be at SFIA Level 4, and at Level 5 for the postgraduate version. The higher SFIA level has more management and less technical responsibility.

A search of SFIA for "data management" turned up references in Business analysis (ANAL), System software (SYSP) and Enterprise architecture (STPL). None of these seem to fit the intended content; the closest is business analysis, but that has too much business and not enough technology.

A search of SFIA for "records management" turned up the Information management (IRMG) skill again.

A search for "publishing" found Information content publishing (ICPM), but this seems to relate more to web design.

So of all these, Information management (IRMG) seems most relevant.

    Metadata and data management for governance

Looking at the higher level, Information management (IRMG) sits in the SFIA subcategory of Information strategy, which also includes Corporate governance of IT (GOVN). At first glance governance does not seem relevant to metadata and data management, being more suited to a course on IT project management.

However, many of the examples I use to explain the uses of metadata and data management are from government and involve keeping records to demonstrate that an organisation is being properly run. It occurred to me that it might be useful to turn around the emphasis on record keeping in case you are taken to court, and instead start by looking at what is needed, in terms of electronic communications and documents, for running an organisation well at the highest level, that is, governance. With this I could start off with the principles of governance and then show how to make effective use of tools like instant messaging and blogs in a corporate environment.





    Friday, July 10, 2009

    The Obama effect: how to win a political campaign with the web

    This year I have again been asked to help teach students about Metadata and Electronic Data Management in the course Information Technology in Electronic Commerce (COMP3410) at ANU. The content will be much the same as last year, but I would like to introduce some central themes to what otherwise looks like a random collection of technologies and techniques.

Recently a policy adviser for an MP pointed out that there is an election coming up in Australia relatively soon and that political parties are interested in how President Obama used the web to help win. They suggested that if I were to offer a course in how to do web-based campaigning, it would be very popular. Political campaigning is not something I have experience in, nor much interest in. What I find more interesting is how to use the technology to run the country after you win the election.

What I have found disappointing in the Australian case is that while a political party might use the Internet to help win an election, once in office this is forgotten and the political process reverts to a manual one, which disenfranchises most citizens. Similarly, the public service might use the Internet to put out public information campaigns, but seems unable to use the technology to communicate effectively with the public.

Perhaps the solution to this is to bring the political and administrative processes together and use a uniform set of technologies and techniques for both. That way the politicians could run the campaign and then be ready to work the same way to run the country. At the same time the public service would not see this way of working as alien.

    To put all that in a more concrete way, I thought I might give one of my seminars for COMP3410, to which the political advisers would be invited, on:
    The Obama effect: how to win a political campaign with the web

    Using blogs, twitter, Google Wave, email, podcasts, the web and the Internet to run a campaign and a country.

Abstract: Much has been written about how President Obama used the web to campaign. The next Australian federal election must be held by April 2011. By then there will be a new generation of Internet and mobile phone technology. How will it be used for campaigning? Can the technology extend beyond the election campaign to give Australian citizens more of a voice in policy development and running the country? Are there some general principles which can be applied to existing and emerging technologies? Tom Worthington explains how the Metadata and Electronic Data Management techniques underlying web technologies can provide a road map to the future.

    See:


    Saturday, July 04, 2009

    Video Archiving System for the Australian Parliament

The Department of Parliamentary Services has issued a Request for Tender for "Provision of Equipment and Services for Media Asset Management and Archiving Solution". This is a digital system for the capture, delivery and archiving of audio and video from the House of Representatives, the Senate, hearings and the like. There are also 55,000 hours of broadcast quality video to be digitised and catalogued. A detailed tender document is available online, with appendices covering Metadata Exchange, Data Storage Backup and Recovery Principles, Style Manual, User Permissions, and ICT Architecture and Standards. There will also be a briefing at Parliament House, 21 May 2009.
    DPS is seeking:

    1. hardware, software and services to construct a digital system for the capture, segmentation, delivery, archiving and management of audiovisual and audio-only content; and
    2. additional services to digitise the estimated 55,000 hours of recorded broadcast quality video of Parliamentary proceedings and special events (back-capture project). ...

    From: Provision of Equipment and Services for Media Asset Management and Archiving Solution, DPS09001, Department of Parliamentary Services, 5-May-2009


    Wednesday, May 06, 2009

    Parliament Media Archive Specification

    The Federal Department of Parliamentary Services has issued a Request for Tender for "Provision of Equipment and Services for Media Asset Management and Archiving Solution" (ATM ID DPS09001, 5-May-2009). They are seeking hardware, software and services for digital capture, segmentation, delivery, archiving and management of audiovisual content from the Parliament. They also have an existing 55,000 hours of broadcast quality video of Parliamentary proceedings to digitise. What makes this interesting is that the tender documents (available to registered companies) include a detailed specification of the system including free TV metadata:

    6 STATEMENT OF REQUIREMENT (SOR) ... 33
    6.1 Overview ... 33
    6.2 Content Capture ... 35
    6.3 Content Enrichment ... 39
    6.4 Non Linear Editing ... 41
    6.5 Content Storage ... 42
    6.6 Content Distribution—Platforms ... 43
    6.7 Content Distribution—Client Access ... 44
    6.8 System Management—Administration .... 48
    6.9 System Management—Health/Status Monitoring .... 49
    6.10 System Management—Information Reporting ... 50
    6.11 System Management—Account Access and Content Security ... 51
    6.12 Data Migration of File Based Content ... 52
    6.13 System Integration ... 52
    6.14 System Architecture and Technology ... 54
    6.15 Project Implementation Plan .... 55
    6.16 Acceptance Testing ... 56
    6.17 Documentation .... 58
    6.18 Training .... 60
    6.19 Support and Maintenance ... 60
    6.20 Additional Services—Back-Capture of Archive Content ... 61
    ...
    9 STAGE 3—TECHNICAL REQUIREMENTS ... 79
    9.1 Solution Overview ... 79
    9.2 Content Capture ... 79
    9.3 Content Enrichment ... 85
    9.4 Non-Linear Editing .... 89
    9.5 Content Storage ... 90
    9.6 Content Distribution—Platforms ... 93
    9.7 Content Distribution—Client Access ... 94
    9.8 System Management—Administration ... 100
    9.9 System Management—Health/Status Monitoring ... 103
    9.10 System Management—Information Reporting .... 104
    9.11 System Management—Account Access and Content Security .... 106
    9.12 Data Migration of File Based Content ... 107
    9.13 System Integration .... 108
    9.14 System Architecture and Technology ... 110
    9.15 Project Implementation .... 113
    9.16 Acceptance Testing .... 113
    9.17 Documentation ... 116
    9.18 Training ... 118
    9.19 Support and Maintenance ... 119
    9.20 Additional Services—Back-Capture of Archive Content ... 121

    Appendix A—Metadata Exchange
    Appendix B—EMMS Metadata
    Appendix C—Data Storage Backup and Recovery Principles
    Appendix D—APH Style Manual
    Appendix E—User Permissions
    Appendix F—APH ICT Architecture and Standards
    Appendix G—Chamber Microphone Interface
    Appendix H—ScheduAll Data Schema
    Appendix I—ParlInfo Schema
    Appendix J—Broadcast Standards Manual ...

    From Table of Contents, "Provision of Equipment and Services for Media Asset Management and Archiving Solution", Department of Parliamentary Services (ATM ID DPS09001, 5-May-2009).


    Friday, August 22, 2008

    Australian Open Source Disaster Management Software Released

    Renato Iannella from NICTA has announced that they have released open source software for disaster management: Cooperative Alert Information and Resource Notification System (CAIRNS). This is intended to demonstrate interoperability of Crisis Information Management Systems (CIMS). It uses XML standards: Emergency Data Exchange Language Distribution Element (EDXL-DE), Emergency Data Exchange Language Resource Messaging (EDXL-RM) and Common Alerting Protocol (CAP).
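To give a rough idea of what the CAP part of that interoperability looks like, here is a minimal sketch of building a Common Alerting Protocol message in Python. The values are invented and only a handful of CAP elements are shown, so it is illustrative rather than a complete, schema-valid alert.

# Minimal sketch of a Common Alerting Protocol (CAP) message, built with the
# Python standard library. Values are invented and only a few elements are
# shown, so this is illustrative rather than schema-valid.
import xml.etree.ElementTree as ET

CAP_NS = "urn:oasis:names:tc:emergency:cap:1.1"
ET.register_namespace("", CAP_NS)

alert = ET.Element("{%s}alert" % CAP_NS)
for name, value in [("identifier", "example-2008-001"),
                    ("sender", "ops@example.org"),
                    ("sent", "2008-08-22T09:00:00+10:00"),
                    ("status", "Exercise"),
                    ("msgType", "Alert"),
                    ("scope", "Public")]:
    ET.SubElement(alert, "{%s}%s" % (CAP_NS, name)).text = value

info = ET.SubElement(alert, "{%s}info" % CAP_NS)
for name, value in [("category", "Fire"),
                    ("event", "Grass fire (exercise)"),
                    ("urgency", "Expected"),
                    ("severity", "Moderate"),
                    ("certainty", "Likely")]:
    ET.SubElement(info, "{%s}%s" % (CAP_NS, name)).text = value

print(ET.tostring(alert, encoding="unicode"))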


    Sunday, August 17, 2008

    What is wrong with the OJS Schema?

Since I was testing the IFIP Digital Library OAI interface, I thought I might as well test the one for the ACS Digital Library and register it with the Open Archives Initiative - Repository Explorer, so researchers could use it as an example of a working system. But it failed the XML Schema validation test. This was the only test it failed, but it is an important one, and a bit odd, as the archive seems to be working okay. Anyone have any thoughts?

    Open Archives Initiative - Repository Explorer

    explorer version - 1.46c : protocol version - 2.0 : December 2006
    Open Archives Initiative :: Protocol for Metadata Harvesting v2.0
    RE Protocol Tester 1.46c :: UCT AIM :: December 2006

    (1) Testing : Identify
    URL : http://dl.acs.org.au/index.php/index/oai?verb=Identify
    Test Result : OK
    ---- [ Repository Name = ACS Digital Library ]
    ---- [ Protocol Version = 2.0 ]
    ---- [ Base URL = http://dl.acs.org.au/index.php/index/oai ]
    ---- [ Admin Email = dl@tomw.net.au ]
    ---- [ Granularity = YYYY-MM-DDThh:mm:ssZ ]
    ---- [ Earliest Datestamp = 2006-12-05T00:40:05Z ]

    (2) Testing : Identify (illegal_parameter)
    URL : http://dl.acs.org.au/index.php/index/oai?verb=Identify&test=test
    Test Result : OK

    (3) Testing : ListMetadataFormats
    URL : http://dl.acs.org.au/index.php/index/oai?verb=ListMetadataFormats
    Test Result : OK
    ---- [ Only oai_dc supported ]

    (4) Testing : ListSets
    URL : http://dl.acs.org.au/index.php/index/oai?verb=ListSets
    ------ Response from Xerces Schema Validation ------
    [Error] re.51bDjv:164:37: cvc-pattern-valid: Value 'crpit:Volume 1' is not facet-valid with respect to pattern '([A-Za-z0-9\-_\.!~\*'\(\)])+(:[A-Za-z0-9\-_\.!~\*'\(\)]+)*' for type 'setSpecType'.
    [Error] re.51bDjv:164:37: cvc-type.3.1.3: The value 'crpit:Volume 1' of element 'setSpec' is not valid.
    /tmp/re.51bDjv: 1632;20;0 ms (76 elems, 52 attrs, 0 spaces, 897 chars)
    ------- End of Xerces Schema Validation Report -------
    ------ Start of XML Response ------

<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
                             http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2008-08-17T01:32:59Z</responseDate>
  <request verb="ListSets">http://dl.acs.org.au/index.php/index/oai</request>
  <ListSets>
    <!-- setDescription elements (empty oai_dc containers) omitted for brevity -->
    <set><setSpec>ajis</setSpec><setName>Australasian Journal of Information Systems</setName></set>
    <set><setSpec>ajis:ED</setSpec><setName>Editorial</setName></set>
    <set><setSpec>ajis:FT</setSpec><setName>AJIS Featured Theme</setName></set>
    <set><setSpec>ajis:ART</setSpec><setName>Articles</setName></set>
    <set><setSpec>ajis:FTA</setSpec><setName>AJIS Featured Theme</setName></set>
    <set><setSpec>ajis:FTB</setSpec><setName>AJIS Featured Theme</setName></set>
    <set><setSpec>ajis:FTC</setSpec><setName>AJIS Featured Theme</setName></set>
    <set><setSpec>jrpit</setSpec><setName>Journal of Research and Practice in Information Technology</setName></set>
    <set><setSpec>jrpit:ART</setSpec><setName>Articles</setName></set>
    <set><setSpec>crpit</setSpec><setName>Conferences in Research and Practice in Information Technology</setName></set>
    <set><setSpec>crpit:ART</setSpec><setName>Articles</setName></set>
    <set><setSpec>crpit:Volume 1</setSpec><setName>Volume 1</setName></set>
  </ListSets>
</OAI-PMH>

    ------- End of XML Response -------
    Test Result : FAIL!
    **** [ERROR] XML Schema validation failed ...

From: "Open Archives Initiative - Repository Explorer" test of ACS DL, explorer version - 1.46c : protocol version - 2.0 : December 2006 (run 17 August 2008)
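Looking at the Xerces error above, the likely culprit is the space in the set specification 'crpit:Volume 1': the OAI-PMH setSpecType pattern only allows letters, digits and a small amount of punctuation, with colons as separators. A quick check of the pattern quoted in the error message, sketched in Python with values taken from the response above:

# Check candidate setSpec values against the setSpecType pattern quoted in
# the Xerces error above. Spaces are not in the allowed character set.
import re

SETSPEC = re.compile(
    r"([A-Za-z0-9\-_\.!~\*'\(\)])+(:[A-Za-z0-9\-_\.!~\*'\(\)]+)*")

for value in ["crpit", "crpit:ART", "crpit:Volume 1"]:
    print(value, "->", "valid" if SETSPEC.fullmatch(value) else "invalid")
# crpit -> valid
# crpit:ART -> valid
# crpit:Volume 1 -> invalid (the space is the problem)

If that is the cause, renaming the offending set to something like 'crpit:Volume1', or otherwise mapping the space to an allowed character when OJS generates the setSpec, should presumably satisfy the schema.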


    Testing IFIP Digital Library OAI Interface

For those wondering about the OAI interface I was asking librarians to test for the IFIP Digital Library: the Open Archives Initiative (OAI) defines a standard computer-to-computer interface (OAI-PMH) which can be used to query a digital library to see what information it has and in what formats. The IFIP Digital Library uses the Open Journal Systems (OJS) open source publishing system, which has OAI built in.

Normally you would point your own OAI software at the address of the IFIP DL and it would send transactions and extract data into your system. But you can also use some test tools to see what is happening. One of these is the Open Archives Initiative - Repository Explorer. With this you supply the address of an OAI system and can then pass queries and see the data. There is also an automatic test which sends a series of queries and checks the answers. These are used in the Conformance Testing for Basic Functionality, Error and Exception Handling by the OAI Registry. Here are the results of the test on the IFIP DL:

    Open Archives Initiative - Repository Explorer

    explorer version - 1.46c : protocol version - 2.0 : December 2006
    Open Archives Initiative :: Protocol for Metadata Harvesting v2.0
    RE Protocol Tester 1.46c :: UCT AIM :: December 2006

    (1) Testing : Identify
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=Identify
    Test Result : OK
    ---- [ Repository Name = International Federation for Information Processing Digital Library ]
    ---- [ Protocol Version = 2.0 ]
    ---- [ Base URL = http://dl.ifip.org/iojs/index.php/ifip/oai ]
    ---- [ Admin Email = tom.gedeon@anu.edu.au ]
    ---- [ Granularity = YYYY-MM-DDThh:mm:ssZ ]
    ---- [ Earliest Datestamp = 2008-04-01T10:07:00Z ]

    (2) Testing : Identify (illegal_parameter)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=Identify&test=test
    Test Result : OK

    (3) Testing : ListMetadataFormats
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListMetadataFormats
    Test Result : OK
    ---- [ Only oai_dc supported ]

    (4) Testing : ListSets
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListSets
    Test Result : OK
    ---- [ Sample Set Spec = ifip ]

    (5) Skipping : ListSets (resumptionToken)
    This test is being skipped because it cannot or should not be performed.

    (6) Testing : ListIdentifiers (oai_dc)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=oai_dc
    Test Result : OK
    ---- [ Sample Identifier = oai:ifip.cs.anu.edu.au:article/711 ]

    (7) Skipping : ListIdentifiers (resumptionToken)
    This test is being skipped because it cannot or should not be performed.

    (8) Skipping : ListIdentifiers (resumptionToken, oai_dc)
    This test is being skipped because it cannot or should not be performed.

    (9) Testing : ListIdentifiers (oai_dc, from/until)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&from=2000-01-01&until=2000-01-01
    Test Result : OK

    (10) Testing : ListIdentifiers (oai_dc, set, from/until)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&set=ifip&from=2000-01-01&until=2000-01-01
    Test Result : OK

    (11) Testing : ListIdentifiers (oai_dc, illegal_set, illegal_from/until)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&set=really_wrong_set&from=some_random_date&until=some_random_date
    Test Result : OK

    (12) Testing : ListIdentifiers (oai_dc, from granularity != until granularity)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&from=2001-01-01&until=2002-01-01T00:00:00Z
    Test Result : OK

    (13) Testing : ListIdentifiers (oai_dc, from > until)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&from=2000-01-01&until=1999-01-01
    Test Result : OK

    (14) Testing : ListIdentifiers ()
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers
    Test Result : OK

    (15) Skipping : ListIdentifiers (metadataPrefix)
    This test is being skipped because it cannot or should not be performed.

    (16) Testing : ListIdentifiers (illegal_mdp)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=illegal_mdp
    Test Result : OK

    (17) Testing : ListIdentifiers (mdp, mdp)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&metadataPrefix=oai_dc
    Test Result : OK

    (18) Testing : ListIdentifiers (illegal_resumptiontoken)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&resumptionToken=junktoken
    Test Result : OK

    (19) Testing : ListIdentifiers (oai_dc, from YYYY-MM-DD)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&from=2001-01-01
    Test Result : OK

    (20) Testing : ListIdentifiers (oai_dc, from YYYY-MM-DDThh:mm:ssZ)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&from=2001-01-01T00:00:00Z
    Test Result : OK

    (21) Testing : ListIdentifiers (oai_dc, from YYYY)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&from=2001
    Test Result : OK

    (22) Testing : ListMetadataFormats (identifier)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListMetadataFormats&identifier=oai:ifip.cs.anu.edu.au:article/711
    Test Result : OK
    ---- [ Only oai_dc supported ]

    (23) Testing : ListMetadataFormats (illegal_id)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListMetadataFormats&identifier=really_wrong_id
    Test Result : OK

    (24) Skipping : GetRecord (identifier, metadataPrefix)
    This test is being skipped because it cannot or should not be performed.

    (25) Testing : GetRecord (identifier, oai_dc)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=GetRecord&identifier=oai:ifip.cs.anu.edu.au:article/711&metadataPrefix=oai_dc
    Test Result : OK

    (26) Testing : GetRecord (identifier)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=GetRecord&identifier=oai:ifip.cs.anu.edu.au:article/711
    Test Result : OK

    (27) Testing : GetRecord (identifier, illegal_mdp)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=GetRecord&identifier=oai:ifip.cs.anu.edu.au:article/711&metadataPrefix=really_wrong_mdp
    Test Result : OK

    (28) Testing : GetRecord (oai_dc)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=GetRecord&metadataPrefix=oai_dc
    Test Result : OK

    (29) Testing : GetRecord (illegal_id, oai_dc)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=GetRecord&identifier=really_wrong_id&metadataPrefix=oai_dc
    Test Result : OK

    (30) Testing : GetRecord (invalid_id, oai_dc)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=GetRecord&identifier=invalid\"id&metadataPrefix=oai_dc
    Test Result : OK

    (31) Testing : ListRecords (oai_dc, from/until)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2000-01-01&until=2000-01-01
    Test Result : OK

    (32) Skipping : ListRecords (resumptionToken)
    This test is being skipped because it cannot or should not be performed.

    (33) Skipping : ListRecords (metadataPrefix, from/until)
    This test is being skipped because it cannot or should not be performed.

    (34) Testing : ListRecords (oai_dc, illegal_set, illegal_from/until)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListRecords&metadataPrefix=oai_dc&set=really_wrong_set&from=some_random_date&until=some_random_date
    Test Result : OK

    (35) Testing : ListRecords
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListRecords
    Test Result : OK

    (36) Testing : ListRecords (oai_dc, from granularity != until granularity)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2001-01-01&until=2002-01-01T00:00:00Z
    Test Result : OK

    (37) Testing : ListRecords (oai_dc, until before earliestDatestamp)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListRecords&metadataPrefix=oai_dc&until=2007-04-01T10:07:00Z
    Test Result : OK

    (38) Testing : ListRecords (oai_dc)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListRecords&metadataPrefix=oai_dc
    Test Result : OK

    (39) Testing : ListRecords (illegal_resumptiontoken)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListRecords&resumptionToken=junktoken
    Test Result : OK

    (40) Testing : ListIdentifiers (oai_dc, set)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&set=ifip
    Test Result : OK

    (41) Testing : GetRecord (identifier, oai_dc)
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=GetRecord&identifier=oai:ifip.cs.anu.edu.au:article/711&metadataPrefix=oai_dc
    Test Result : OK
    ---- [ Found setSpec in header ]

    (42) Testing : IllegalVerb
    URL : http://dl.ifip.org/iojs/index.php/ifip/oai?verb=IllegalVerb
    Test Result : OK

    ---- Total Errors : 0


    From: "Open Archives Initiative - Repository Explorer" test of IFIP DL, explorer version - 1.46c : protocol version - 2.0 : December 2006 (run 17 August 2008)

    The Self description for the archive:

    Open Archives Initiative - Repository Explorer

    explorer version - 1.46c : protocol version - 2.0 : December 2006

    http://dl.ifip.org/iojs/index.php/ifip/oai?verb=Identify

    Archive details : http://dl.ifip.org/

    Archive Self-Description

    Repository Name : International Federation for Information Processing Digital Library
    Base URL : http://dl.ifip.org/iojs/index.php/ifip/oai
    Protocol Version : 2.0
    Admin Email : tom.gedeon@anu.edu.au
    Earliest Datestamp : 2008-04-01T10:07:00Z
    Deleted Record Handling : no
    Granularity : YYYY-MM-DDThh:mm:ssZ
    Compression : gzip
    Compression : deflate
    Other Information
    description:
    oai-identifier:
    scheme: oai
    repositoryIdentifier: ifip.cs.anu.edu.au
    delimiter: :
    sampleIdentifier: oai:ifip.cs.anu.edu.au:article/1

    Request : http://dl.ifip.org/iojs/index.php/ifip/oai, verb=Identify
    Response Date : 2008-08-17T04:35:13Z

    Requesting a list of metadata formats returns:

    Open Archives Initiative - Repository Explorer

    explorer version - 1.46c : protocol version - 1.0/1.1/2.0 : December 2006

    http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListMetadataFormats

    Archive details : http://dl.ifip.org/

    List of Metadata Formats

    Click on the link to view schema

    Prefix=[oai_dc]
    NameSpace=[http://www.openarchives.org/OAI/2.0/oai_dc/]
    Schema=[http://www.openarchives.org/OAI/2.0/oai_dc.xsd]

    Prefix=[oai_marc]
    NameSpace=[http://www.openarchives.org/OAI/1.1/oai_marc]
    Schema=[http://www.openarchives.org/OAI/1.1/oai_marc.xsd]

    Prefix=[marcxml]
    NameSpace=[http://www.loc.gov/MARC21/slim]
    Schema=[http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd]

    Prefix=[rfc1807]
    NameSpace=[http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1807.txt]
    Schema=[http://www.openarchives.org/OAI/1.1/rfc1807.xsd]


    Request : http://dl.ifip.org/iojs/index.php/ifip/oai, verb=ListMetadataFormats
    Response Date : 2008-08-17T00:56:06Z

    All the records in the archive can then be displayed:

    Open Archives Initiative - Repository Explorer

    explorer version - 1.46c : protocol version - 2.0 : December 2006

    http://dl.ifip.org/iojs/index.php/ifip/oai?verb=ListIdentifiers&metadataPrefix=oai_dc

    Archive details : http://dl.ifip.org/

    List of Record Identifiers

    Select a link to view more information ...

    header:
    identifier : oai:ifip.cs.anu.edu.au:article/711
    datestamp : 2008-04-01T10:07:00Z
    setSpec : ifip:SS

    [display record in Dublin Core] [display metadata formats]

    ...

    header:
    identifier : oai:ifip.cs.anu.edu.au:article/1273
    datestamp : 2008-07-16T10:07:00Z
    setSpec : ifip:SS

    [display record in Dublin Core] [display metadata formats]


    Request : http://dl.ifip.org/iojs/index.php/ifip/oai, verb=ListIdentifiers, metadataPrefix=oai_dc
    Response Date : 2008-08-17T03:31:36Z

    The metadata for a record can be selected:

    Open Archives Initiative - Repository Explorer

    explorer version - 1.46c : protocol version - 2.0 : December 2006

    http://dl.ifip.org/iojs/index.php/ifip/oai?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai%3Aifip.cs.anu.edu.au%3Aarticle%2F1273

    Archive details : http://dl.ifip.org/

    List of Fields

    header:
    identifier : oai:ifip.cs.anu.edu.au:article/1273
    datestamp : 2008-07-16T10:07:00Z
    setSpec : ifip:SS

    metadata:
    dc:
    title: Differentiated Survivability in a Distributed GMPLS-Based IP-over-Optical Network
    creator: David Harle
    creator: Saud Albarrak
    subject:
    subject:
    subject:
    description: A key element of the survivability cost is the amount of the additional capacity embedded in the network for recovery purposes. Generally, Internet Service Providers (ISPs) have the obvious aim of achieving the required level of survivability with minimum resource consumption and network cost. Therefore, in order to achieve such a challenge, it is necessary to move towards the multilayer differentiated survivability concept. In this paper, the focus is given to the investigation of the application of differentiated survivability concept with pre-allocated restoration technique considering a distributed GMPLS-based IP-over-optical mesh network under single and dual-link failure scenarios.
    publisher: International Federation for Information Processing
    contributor:
    date: 1970-01-01
    type: Peer-reviewed Article
    type:
    format: application/pdf
    identifier: http://dl.ifip.org/iojs/index.php/ifip/article/view/1273
    source: International Federation for Information Processing Digital Library; Optical Network Design and Modelling (ONDM 2008);
    language:
    coverage:
    coverage:
    coverage:
    rights: Copyright© International Federation for Information Processing. Most IFIP papers are Under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License.


    Request : http://dl.ifip.org/iojs/index.php/ifip/oai, verb=GetRecord, metadataPrefix=oai_dc, identifier=oai:ifip.cs.anu.edu.au:article/1273
    Response Date : 2008-08-17T04:31:28Z


    Sunday, August 10, 2008

    Web 2 format for Environmental Product Data?

    The Department of the Environment, Water, Heritage and the Arts (DEWHA) has issued a Request for Tender for Provision of Desktop, LAN, Helpdesk and Midrange Services. This asks for detailed environmental performance data, so I asked my ANU metadata students how that could be supplied via the web:
    COMP6341 students: The Department of the Environment, Water, Heritage and the Arts (DEWHA) issued a Request for Tender for "Provision of Desktop, LAN, Helpdesk and Midrange Services" last week. The 278 page tender document includes extensive and detailed environmental requirements.

Tenderers are required to provide the information specified in the IEEE 1680 environmental standard. The Electronic Product Environmental Assessment Tool (EPEAT) is a spreadsheet manufacturers can fill in to rate their products and then upload the results to a central database. The information can be displayed as a web page and can be exported as a spreadsheet.

But fewer than a thousand computer products are listed by EPEAT. Many more have not been rated using their system, and manufacturers may also have to list their products under schemes in Europe and elsewhere. Manufacturers already provide product information on their web sites. Having to supply data in different formats to different rating organisations is an additional burden.

    Look at the EPEAT entry for the DELL OptiPlex 745 Energy Smart:

  1. What are some options for marking up the information about the DELL OptiPlex 745 in an HTML and/or XML format, so that it could be read by the general public as a web page and also automatically input into the databases of different rating bodies? The web page should be similar to the ones Dell supplies, but have metadata embedded in it (one possible approach is sketched after these questions).
    2. What are the benefits of different ways to mark up the metadata?
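By way of a hint for the first question, one option is to embed the environmental data as name/value meta elements in the product page, so that a browser shows the ordinary page while a rating body's harvester reads the metadata. A minimal sketch in Python follows; the element names and values are invented placeholders, not a published vocabulary or the product's actual figures.

# Sketch: a product page with embedded metadata, and a harvester that reads
# the metadata back out. Names and values are placeholders for illustration.
from html.parser import HTMLParser

page = """<html><head>
<title>Dell OptiPlex 745 Energy Smart</title>
<meta name="product.name" content="Dell OptiPlex 745 Energy Smart">
<meta name="epeat.rating" content="(rating here)">
<meta name="power.idle.watts" content="(figure here)">
</head><body><h1>Dell OptiPlex 745 Energy Smart</h1></body></html>"""

class MetaHarvester(HTMLParser):
    """Collect name/content pairs from meta elements."""
    def __init__(self):
        super().__init__()
        self.records = {}
    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            if "name" in attrs and "content" in attrs:
                self.records[attrs["name"]] = attrs["content"]

harvester = MetaHarvester()
harvester.feed(page)
print(harvester.records)

The same idea can be taken further with an agreed vocabulary (for example Dublin Core style elements, or RDFa attributes on the visible content), which is really what the second question is asking about.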


    Monday, March 10, 2008

    Metadata Management Forum

The 4th Annual Metadata Management Forum is 21-22 April 2008 in Melbourne. I am speaking on "The importance of metadata within a search engine context – Metadata versus Google". Other speakers include Rhonda Bradford, Australian Taxation Office, on "Metadata implementation"; Bala Rasaratnam, National Australia Bank, on "Managing the longevity of metadata"; and Marcus Falley, Coles Group, on "Using metadata to facilitate enterprise search".

    Somehow I have to explain the following in 45 minutes:
• Custom software versus the web
• Improving tracking results with better search engine performance
• Understanding and weighing the importance of metadata and matching that to specific actions that would improve search engine performance
• Evaluating critical steps undertaken to align and leverage between a business/product strategy and an enterprise strategy
• Making an improvement to the underlying product development, takeup and support through metadata interpretation, web trend analysis and campaign management.
    The draft program:
    Conference Programme Day One
    Monday April 21st 2008

    0830 Registration & coffee

    0850 Welcoming address from the Chair
    Mark Brannigan President
    Data Warehouse Association Australia

    0900 Session One – Expert Advice

    Standards and uptake of metadata
    • Assessing the level of uptake and implementation of metadata among Australian organisations
    • Exploring the emerging standards and opportunities for integration
    • Leveraging metadata to ensure a greater integration of databases to benefit businesses and consumers in the global marketplace
    • Comprehending the value and role of standards
    • Understanding the future directions of metadata
    • Categorising metadata standards – updating your knowledge on standards: which standards and why?
    • Learning from Australian government metadata initiatives to improve record keeping
    • Metadata and governance, including standard deliverables
    • Maintaining metadata standards and guidelines

    Barbara Reed Director
    Recordkeeping Innovation

    0945 Session Two – Case Study

    Metadata implementation – understanding the ins and
    outs
    • Analysing the challenges faced by businesses in their quest to implement metadata
    • Ensuring quality stewardship throughout the entire metadata
    implementation
    • Demonstrating positive outcomes to build momentum and enthusiasm
    • Integrating data, information and knowledge
    • Overcoming resistance when implementing a metadata policy across your organization
    • Considering the advantages of different approaches to make
    integration work
    • Metadata processes and how to establish common definitions

    Rhonda Bradford Senior Data Architect
    Australian Taxation Office

    1030 Morning refreshments & networking break

    1100 Session Three – International Award-Winning Keynote
    Presentation

    Part One

    The business case for metadata – how does metadata
    impact performance?
    • How metadata deployment can give your organization a competitive advantage
    • Weighing the benefits against the resource requirements
    • Clearly defined strategies, goals and benefits of metadata as a business tool
    • Business cases and performance management – where does metadata fit in?
    • Identifying metadata projects that will deliver real benefits to your organisation’s business models

    1145 Part Two

Managing metadata to reduce costs and aid in decision-making
    • Implementing effective strategies that ensure your directories are populated with the most current data
    • Clearly defining responsibilities for the management of diverse metadata
    • Understanding how metadata can be damaging to your organisation’s integrity if it is not managed correctly
    • Lowering costs for implementation and maintenance of your enterprise applications
• Demonstrating to senior managers the cost savings brought by the implementation of effective Metadata management strategies – the best way to make your case

    Ron Klein Enterprise Metadata Director
    BMO Financial Group
    2007 Wilshire Award-Winner for Metadata

    1230 Luncheon

    1330 Session Four – Expert Advice

    Implementing a metadata policy for your organisation
    • Understanding the need for having a Metadata policy
    • Analysing winning techniques in developing metadata policy
    • Selling it effectively to the relevant parties in an organisation
    • Evaluating the best way to go about the implementation process

    Kate Walker CEO
    Records Management Association of Australia

    1415 Session Five – Case Study

    Managing the longevity of metadata – the importance of
    quality maintenance
    Implementing metadata will only get the business halfway there. It is essential that a smart strategy is in place to maintain metadata over time.

    This session will offer you the opportunity to consider if you have addressed all the appropriate business requirements to ensure not only a successful uptake and integration, but also to maintain the long-term value of metadata. It is also imperative for business stakeholders to understand that metadata is not a ‘quick-win’ way of saving the business money; it is a part of the business that requires commitment over time to
    yield significant ROI. This session will look at:
    • Metadata management & the issues organisations face in making it real & sustainable
    • Building awareness of metadata maintenance in your organisation
    • Tools and capability
    • Determining whether to centralise or decentralise the maintenance process
    • Identifying the major stakeholders of the initiative to ensure long term success for your project

    Bala Rasaratnam Data Management Lead - Enterprise Services
    Technology
    National Australia Bank

    1500 Afternoon refreshments & networking break

    1530 Session Six – Expert Advice

    The importance of metadata within a search engine
    context – Metadata versus Google
    • Custom software versus the web
    • Improving tracking results with better search engine performance
    • Understanding and weighing the importance of metadata and matching that to specific actions that would improve search engine performance
    • Evaluating critical steps undertaken to align and leverage between a business/product strategy and an enterprise strategy
    • Making an improvement to the underlying product development, takeup and support through metadata interpretation, web trend analysis and campaign management

    Tom Worthington Senior Lecturer
    Australian National University

    1615 Session Seven – Case Study

    Maximising the role of controlled vocabularies to support
    information quality
    • Implementing practical thesaurus development guidelines
    • Understanding a project’s thesaurus scope
    • Developing an effective thesaurus structure
    • Analysing business definitions, values and the role of taxonomies
    • Addressing validation, data dictionaries and thesauri

    Vanessa Booth Content Manager
    Victoria Online - Department of Innovation, Industry and
    Regional Development

    1700 End of Day One

    Tuesday April 22nd 2008
    Conference Programme Day Two

    0830 Morning coffee

    0850 Opening address from the Chair

    0900 Session One – Case Study

    Using metadata to facilitate enterprise search
    • Identifying who the searchers are
    • Understanding the terms and hierarchy of the business taxonomy to
    maximise effective findability
    • Guidelines for building a successful taxonomy – using the KISS approach
    • Categorising documents for search – a decision for governance
    • Examining enterprise search architecture
    • Developing best practice data processes, tools and models to strive for
    metadata management maturity

    Marcus Falley Senior Optimisation and Reporting Analyst
    Coles Group

    0945 Session Two – Case Study

    From silo to single enterprise - developing a whole of
    government metadata system to obtain best value out of
    government information
    • Valuing, describing and publishing government information assets
    • Recognising the information needs of different user communities
    • Examining the technical architecture
    • Challenges and benefits in developing the system
    • The data package - data, metadata and licence

    Jenny Bopp Principal Statistician, Office of Economic and Statistical
    Research
    David Torpie Principal Statistician, Office of Economic and Statistical
    Research
    Queensland Treasury

    1030 Morning refreshments & networking break

    1100 Session Three – Case Study

    Implementing and maintaining metadata – benefits
    achieved and lessons learned at the State Records of South Australia
    Over the past 2 years State Records of South Australia has implemented a records and metadata strategy to facilitate discovery of documents as well as improve staff productivity and management decision-making.
This session is a case study of the implementation of that strategy and will cover:
    • Developing and implementing a metadata strategy
    • Identifying the records to be captured, planning the metadata to be collected and implemented
    • Implementing the capture of metadata into work practices
    • Examining policies and procedures, embarking on change management
    and training and maintaining the metadata

    Karen Horsfall Information Management Strategist State Records of South Australia

    1145 Luncheon

    1300 Award-Winning International Case Study & Best Practice Workshop
    Facilitated by:
    Gregg Wyant Chief Architect and General Manager of IT Strategy,
    Architecture & Innovation
    Intel Corporation

    Data quality and Service Oriented Architecture – what are
    the requirements?
    • Evaluating the metadata readiness of your business data, applications and technology architectures
    • Implementing an enterprise architecture framework for explicitly defined assets
    • Web services and the components of Service Oriented Architectures
    • Specific SOA requirements as they pertain to data quality
    • The specific data quality steps that should be taken to ensure success
    • Business impact of information quality
    • How data quality affects the bottom line
    • How SOA repository improves programming productivity and increases re-use
• How to effectively architect a data integration and data warehouse strategy using an SOA data services approach to accelerate deployment, reduce risks and lower costs

    Connecting metadata to information architecture
    • Relationship between metadata, repository and architecture
    • Metadata Tool Choices: distributed vs centralized
    • Achieving buy-in from the architecture community
    • Establishing authoritative sources of metadata and ownership
• Choosing between opportunistic and systemic approaches to populating metadata
    • Aligning architecture work products to Enterprise Architecture
    • Capturing metadata services in the repository and utilising them at design time by architects

    Analysing enterprise-wide benefits of metadata
    • Measuring reuse of your enterprise architecture assets via metadata
    • Valuing your enterprise assets – calculating business value
    • Metadata governance processes and issues
    • Expanding reuse valuation to all architectural assets
    • Using reuse success to expand your metadata efforts
    1700 Closing remarks from the Chair and end of conference


    Workshop Schedule
    1300 Opening and start of the workshop, Module 1
    1500 Afternoon tea
    1515 Workshop resumes, Module 2
    1700 Close of workshop.
    See also books on:
    1. Information Management
    2. Managing Records
    3. Archives
    4. Information Architecture
    5. Metadata
    6. Electronic Documents
    7. Electronic Publishing
    8. Data Mining
    9. Preserving Digital Information
    10. Public Sector Management
    11. e-Government
    12. Electronic Document Software

    Labels: ,

    Friday, November 09, 2007

    Metadata for Electronic Conveyancing

In August I wrote about the National Electronic Conveyancing System (NECS), which is developing standards for buying property online, including lodging land titles. Progress seemed slow then and now seems to have become slower, with media reports of disputes between the states. Refreshingly for such a body, the NECS publish their minutes in a blog-like format, so you can read the official version and then compare it with the media reports.

Some media reports are about whether a federated or centralized model should be used. This is partly a matter of state rivalry and of possible loss of state revenue to a federal body; little of it seems to be about making a more efficient system for the benefit of homeowners. There are some obvious ways to get around these issues, which NECS do not appear to have employed.

    Labels: ,

    Friday, September 28, 2007

    Metadata for data processing

CSIRO's ICT Centre held a seminar on 28 September 2007 by Roland Viger, of the US Geological Survey, Colorado, USA, on "Using geoprocessing specification as semantic metadata with GEOLEM". There is a description of GEOLEM available. The techniques might be applied to business applications.

Essentially Roland created a portable layer between the Geographic Information System (GIS), which holds the data, and an environmental model which uses it. He defines "compound commands", a small scripting language for taking data from the GIS and assembling it into something environmentally meaningful. The middle layer is written in Java.

This raises the question of whether the technique could be expanded beyond GISs and environmental applications. Could such scripting languages be used to make large collections of data understandable for specific groups of users? Conversely, could languages used to define software design, such as that used for Shane Flint's Aspect-Oriented Thinking, be used for environmental applications, or even a language for business logic such as ebXML?

    Perhaps these techniques could be used to write mini-languages, using XML syntax, to define transformations. These transformations would then process the data. After many layers of transformation the result would be the one the user wanted.
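As a rough sketch of how such layered transformations might be chained (the file names, stylesheets and pipeline stages below are invented for illustration and are not part of GEOLEM), a few lines of Python using lxml could apply a series of XSLT "mini-language" stylesheets to an XML export from the GIS:

# A minimal sketch, not GEOLEM itself: chain several small XSLT
# transformations over an XML data set using lxml.
# The file names and stylesheets are hypothetical examples.
from lxml import etree

# Each stylesheet performs one layer of transformation, for example
# select features, aggregate them, then format them for the model.
pipeline = ["select_features.xsl", "aggregate.xsl", "format_for_model.xsl"]

doc = etree.parse("gis_export.xml")   # data exported from the GIS
for stylesheet in pipeline:
    transform = etree.XSLT(etree.parse(stylesheet))
    doc = transform(doc)              # the output of one layer feeds the next

print(etree.tostring(doc, pretty_print=True).decode())

Each stylesheet only needs to know about the layer below it, which is what makes the middle layer portable.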

    Labels: , , ,

    Sunday, September 16, 2007

    Web to reduce UAV bandwidth use

Unmanned Aerial Vehicles (UAVs), or robot planes, are used for remote surveillance, but use up a lot of bandwidth sending back images. The US military is providing millions of dollars for research on how to reduce the bandwidth needed, but seems to have missed the obvious: use web technology.

    Reports such as "$10M to Utah State to Help Ease ISR Bandwidth Crunch" (11-Sep-2007 14:45 Watershed Publishing LLC), indicate that networking and image processing will be exploited to reduce bandwidth:

    Utah State University Research Foundation, North Logan, Utah, is being awarded $10M for cost-plus-fixed-fee completion task order #0007 under previously awarded contract (N00173-02-D-2003) for research in the area of Time Critical Sensor Image/Data Processing. Specifically, they will research advanced . ... massive bandwidth crunch being created by hundreds of video-equipped UAVs and networked airborne ISR systems sending video back to base. ... The Naval Research Laboratory, in Washington, DC issued the contract.

However, a better way to reduce data transmission is not to send the data in the first place. The typical UAV is really just a remote-control airplane, like a larger version of a hobby plane. With a little more intelligence the camera can transmit only when there is something interesting to see, and at the resolution the user requires. The images can be zoomed in on to provide a high resolution view of a small area. Progressive scanning schemes can be used to give a low resolution preview and then add detail in the area of interest. The aircraft can store the data for later replay. These are all capabilities available in web image formats and off-the-shelf open source web server technologies, rather than something needing millions of dollars in research.
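As a minimal sketch of the "do not send it in the first place" idea (the threshold, preview size and file names are invented for illustration), the on-board computer could compare each frame with the last one transmitted and only pass on frames that have changed noticeably, downscaled to the resolution requested and saved as a progressive JPEG so a coarse preview arrives first:

# A rough sketch of on-board filtering, not a real UAV payload:
# only "transmit" a frame when it differs enough from the last one
# sent, at the requested resolution, as a progressive JPEG.
# The threshold, sizes and file handling are hypothetical.
from PIL import Image, ImageChops, ImageStat

CHANGE_THRESHOLD = 12        # mean pixel difference needed before sending
PREVIEW_SIZE = (320, 240)    # resolution the ground station asked for

def worth_sending(previous, current):
    """True if the new frame differs noticeably from the last one sent."""
    diff = ImageChops.difference(previous, current)
    return sum(ImageStat.Stat(diff).mean) > CHANGE_THRESHOLD

def prepare_for_transmission(frame, path):
    """Downscale and save as a progressive JPEG (low resolution preview first)."""
    frame.resize(PREVIEW_SIZE).save(path, "JPEG", progressive=True, quality=60)

last_sent = None
for name in ["frame001.jpg", "frame002.jpg", "frame003.jpg"]:
    frame = Image.open(name).convert("RGB")
    if last_sent is None or worth_sending(last_sent, frame):
        prepare_for_transmission(frame, "send_" + name)
        last_sent = frame    # full resolution copy stays on board for later replay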

A hand-launched UAV could use a small off-the-shelf computer such as the business-card-sized Via Mobile-ITX, which is intended for use in a smartphone.



    Labels: , , ,

    Saturday, August 04, 2007

    hCalendar Microformat in Moodle Courseware System

    Perhaps someone could try my first attempt at hCalendar and let me know if it is syntactically correct? hCalendar is a microformat for embedding event details in web pages.

    The calendar function in the Moodle courseware system doesn't have hCalendar built in. So I used hCalendar creator to mark up the date, time and other details. I then pasted that into the Moodle editor. Just pasting the entry as is did not look very pretty, so I marked up the text as a normal looking web page. Hopefully I did not break any of the hCalendar markup in the process.

    The microformats idea is clever, in that the hCalendar metadata is added to a normal web page, rather than creating some special new type of document. This is an important feature of the microformats approach: you are not required to implement a strict data format, just use the markup around the bits of the document containing the data for the particular Microformat. The software reading the microformat will ignore the parts of the web page not marked. This is a more flexible approach than creating a special XML document and then transforming it to be readable.

    In theory, suitably equipped web browsers should be able to detect the hCalendar. But I don't have such software, so if someone could try it, that would be useful.

    hCalendar Code

hCalendar Creator is a web service which allows you to type the details of an event into a web form and have the hCalendar generated. This is useful for seeing how hCalendar works (ideally the facility would be built into whatever scheduler program you use). The form prompts you for the summary, location, URL, start time, end time, timezone and description of the event. It generates two versions of the hCalendar: one with whitespace to make the code readable and a compact version.

    Example code for event:
    <div class="vevent" id="hcalendar-Broadband-for-Environmental-Sustainability">
    <a class="url" href="http://education.acs.org.au/calendar/view.php?view=day&course=55&amp;amp;amp;amp;amp;cal_d=19&cal_m=9&cal_y=2007">
    <abbr class="dtstart" title="20070919T1730+1000">September 19th 5:30pm</abbr>,
    <abbr class="dtend" title="20070920T1900+1000"> 7pm 2007</abbr> —
    <span class="summary">Broadband for Environmental Sustainability</span>— at
    <span class="location">Australian National University, Room N101, Computer Science Building, North Road, Canberra</span>
    </a>
    <div class="description">Professor Eckermann argues that broadband telecommunications can be used to make a positive contribution to environmental sustainability. Energy inefficient activities can be displaced with earth-friendly alternatives, such as renting a movie via the Internet (avoiding trips to and from the video store), working some days from home (saving travel costs and easing pressures on the road) or using "networked intelligence" to manage energy and water consumption more efficiently.</div>
    <p style="font-size: smaller;">This
    <a href="http://microformats.org/wiki/hcalendar">hCalendar event</a> brought to you by the
    <a href="http://microformats.org/code/hcalendar/creator">hCalendar Creator</a>.
    </p>
    </div>
    This renders in the web browser as:
September 19th 5:30pm, 7pm 2007 — Broadband for Environmental Sustainability — at Australian National University, Room N101, Computer Science Building, North Road, Canberra
    Professor Eckermann argues that broadband telecommunications can be used to make a positive contribution to environmental sustainability. Energy inefficient activities can be displaced with earth-friendly alternatives, such as renting a movie via the Internet (avoiding trips to and from the video store), working some days from home (saving travel costs and easing pressures on the road) or using "networked intelligence" to manage energy and water consumption more efficiently.

    This hCalendar event brought to you by the hCalendar Creator.

This is not the format I want on my web page, so instead I took the various bits of hCalendar markup and inserted them into my page.

    So where I had the start time, I added around it the markup:
    <abbr class="dtstart" title="20070919T1730+1000">September 19th 5:30pm<
    Around the summary I added:
    <span class="summary">Broadband for Environmental Sustainability</span>
This points up two different ways that Microformat data can be marked up. In the case of the time, the data is duplicated in the added markup: the date and time are encoded in the added abbreviation tag as "20070919T1730+1000" as well as appearing in the rendered text ("September 19th 5:30pm"). In the case of the summary, the visible text itself is used as the data.

There is a waste of space, and a risk of confusion, in the way the time is encoded. Ideally none of the data would be duplicated and the same values would be both machine readable and displayable, but the limitations of existing HTML do not allow this.
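For anyone who wants to check that the markup is at least machine readable, a few lines of Python with BeautifulSoup will pull the event details back out of the page. This is a rough sketch, not a full microformats parser, and the file name is just an example:

# A minimal sketch: extract hCalendar event details from a saved page.
from bs4 import BeautifulSoup

html = open("event.html").read()            # page containing the vevent markup
soup = BeautifulSoup(html, "html.parser")

event = soup.find(class_="vevent")
if event is not None:
    summary = event.find(class_="summary")
    dtstart = event.find(class_="dtstart")
    location = event.find(class_="location")
    print("Summary: ", summary.get_text(strip=True) if summary else "?")
    # For dtstart the machine readable value is in the title attribute
    print("Starts:  ", dtstart.get("title") if dtstart else "?")
    print("Location:", location.get_text(strip=True) if location else "?")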

    ps: The entry is for a real event in Canberra. All those with an interest in broadband and the environment are welcome to attend.

    Labels: , ,

    Wednesday, August 01, 2007

    Australian GeoNetwork Developers Group

On 1 August 2007 I attended the initial meeting of the Australian GeoNetwork Developers Group at the impressive Geoscience Australia (GA) green building in Canberra. These are my notes from the meeting (not official minutes). The GA building is worth visiting, even if you are not having a meeting.

    Geoscience Australia

    ... Geoscience Australia plays a critical role by producing first-class geoscientific information and knowledge. This can enable the government and the community to make informed decisions about the exploration of resources, the management of the environment, the safety of critical infrastructure and the resultant wellbeing of all Australians.

    From: About us, Geoscience Australia, 2007

GA have a public exhibition of minerals and seismic instruments, as well as a map shop (which also sells polished rocks), a library and a cafe, which the public is welcome to use. This is well worth a visit for tourists. Also they are having an open day on 26 August 2007.

Geoscience Australia provides part of the Australian Tsunami Warning System (such systems haven't had all the bugs shaken out of them yet). GA also acquire seismic and other mapping data to help with mineral exploration and natural resource use, and are looking to provide access to it via the internet, which is why they were hosting the meeting.

    Office of Spatial Data Management

    The meeting was called by Ben Searle, General Manager, Office of Spatial Data Management in Geoscience Australia, which looks after Government mapping policy:
    The role of the Office of Spatial Data Management (OSDM) is to:
    • provide administrative support to the Spatial Data Policy Executive (SDPE) and the Spatial Data Management Group (SDMG);
    • implement the workplan and manage the working groups established by SDMG;
    • facilitate sharing of experience and expertise between Australian Government agencies;
    • provide technical advice to the SDMG;
    • promote efficient use of Australian Government spatial data assets;
    • represent the Australian Government's interests in spatial data coordination and access arrangements with the States and Territories; and
    • foster the development of a private sector spatial information industry.
    From: About OSDM, Office of Spatial Data Management, Geoscience Australia, 25 Jan 2006
    Australian GeoNetwork Developers Group

There were about 20 people present at the meeting in Geoscience Australia's Scrivener Room (the room has a wavy ceiling which improves the acoustics, but the computer-controlled daylight-adjusting lights were distracting). The meeting was scheduled to start at 9:30am, but I was 20 minutes late, arriving just as Ben was finishing the introductions. Here is the agenda, annotated with my notes:

    Agenda

    1. Introduction, Ben Searle.

    2. Overview of Meeting objectives, Ben Searle: Ben suggested the need for both companies and researchers to be involved. He suggested that open source should be used and that Australia needed to work with international standards.

This is what was on the agenda as Possible Meeting Objectives:
    • Establish a management mechanism - Terms of Reference?
    • Agree on a ‘single point of contact’
    • Determine who wants what and who can contribute resources
    • Establish short and long term needs
    • Determine need for a technical meeting
    • Identify the User Community
    • Identify possible software developers
    • Agree on open source principles
    • Identify possible resources
Much of what was discussed in the meeting was geospatial-specific, and my only experience of that was helping with the Sentinel fire tracking system (including an experimental alternative web interface designed for mobile phones).

    3. Overview of GeoNetwork application, Kate Roberts: Kate talked about the BlueNet MEST project:

    The BlueNet project will establish a national distributed marine science data network linking universities to the AODC, to support the long term data curation requirements, and data access needs of Australia’s marine science researchers.

    BlueNet will build infrastructure to enable the discovery, access and online integration of multi-disciplinary marine science data on a very large scale, to support current and future marine science and climate change research, ecosystem management and government decision making. ...

    From: BlueNet, University of Tasmania, 2007

    OCHA Maps-On-Demand

    BlueNet are using the GeoNetwork open source software. Their system is up and running but most records are not yet available to the public. However, the system has a similar interface to other GeoNetwork implementations, such as the Office for the Coordination of Humanitarian Affairs' (OCHA) "Maps-On-Demand".

Kate mentioned ISO 19115, the Geographic Information Metadata standard from ISO, and the Z39.50 protocol. Problems with security got a mention (LDAP was seen as the solution), as did problems with the flexibility of the software for handling XML data and with the intellectual property of different data sets (including provision for Creative Commons). ePrints was also mentioned. Many of these issues were familiar, particularly how to share information while retaining the owner's rights.

Kate then gave a demo. Unfortunately the text was so small I could not read any of it, apart from the logo at the top of the screen. Perhaps UTAS needs a web accessibility course.

One question asked was how to use the thumbnail images on the right side of the screen. At first these seemed to be purely decorative, so the issue did not seem relevant, but on the next screen it turned out that this was where commands were displayed.

    The term "clone" was used to indicate "copy". This was confusing and also is potentially emotionally loaded for the general public, with the debate over human cloning.

    A very complex nested metadata form was then shown. This could be useful for metadata experts (and the students I teach metadata to), but will be unusable for the average user. A simpler web search type interface is needed.

My only quibble with the technical standards is that GeoNetwork's use of Z39.50 is a bit dated (and something only a librarian could love). Web Services would be a better idea. However, Z39.50 might be needed to interoperate with other repositories.
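As a sketch of what a plain web-service style query looks like, this is roughly the kind of HTTP request an OGC Catalogue Service for the Web (CSW) front end accepts. The host name below is made up, and a real catalogue endpoint may differ:

# A rough sketch of a web-service style catalogue query (an OGC CSW
# GetCapabilities request over plain HTTP) as an alternative to Z39.50.
# The host name is hypothetical; a real catalogue endpoint may differ.
from urllib.request import urlopen
from urllib.parse import urlencode

params = urlencode({"service": "CSW", "request": "GetCapabilities"})
url = "http://catalogue.example.gov.au/csw?" + params

with urlopen(url) as response:
    capabilities_xml = response.read().decode("utf-8")

# The returned XML lists the operations and metadata formats the catalogue supports.
print(capabilities_xml[:500])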

The OakLaw project (Open Access to Knowledge, or OAKL) got a mention, as they are looking at adapting Creative Commons for Australia (Prof. Brian Fitzgerald from QUT talked about it at the National Scholarly Communications Forum 2007). It was suggested that the Queensland Government was looking to use CC for government data; this could then be applied to Commonwealth data. The people doing Oak Law at QUT have already produced an Australian version of the Creative Commons license. This would seem to be adequate for use by government agencies (but I am not a lawyer).

    The demo then showed a map of Tasmania, at different resolutions and pop-up windows of data from features on the map. The interface could do with some of the user friendly features of Google Maps.

    In some examples the thumbnails were small maps, which looked useful.

    4. Discussion on Governance Mechanisms and Related Issues, Everybody:

This is what was on the agenda for Governance Mechanisms and Issues:
    • How best to arrange, manage and coordinate our activities?
    • How often do we meet?
    • Do we need a technical group to support a management group?
    • Do we need a single point of contact?
    • Who is best suited, interested and willing to perform this role?
    • Should the point of contact be funded?
    • How do we develop and coordinate specification development?
    • How do we prioritise development activities?
    • Do we allow participation of the commercial sector?
• Can they assist in the management, development and/or as project participants?
There was a discussion of what mechanism would work for government agencies. This reminded me of the administrative process I helped invent for the creation of Internet networks and web sites of the Australian Government. This started out with a self-appointed group (the Commonwealth Internet Reference Group, formed in 1994) and was later formalized. I suggested a similar strategy, with OSDM as the lead agency and endorsement from AGIMO.

The Australian Bureau of Statistics (ABS) is taking a similar approach to statistical data coordination with its National Data Network, as is proposed for geo data. I suggested using the administrative processes and terminology developed by ABS.

    5. Identify Priorities, Everybody:

This is what was on the agenda for Identify Priorities:
• What are the key technical issues that need resolution first?
    • Do we need to hold a more technical meeting to commence the specification development?
    • Identification of resources including agencies willing to support the development process, funds and other resources
    • Do we need a short term and long term objective and can these be carried out concurrently?
There was then a discussion of the software tools needed. Many of these were geo-specific; some were to do with XML data validation. There was discussion as to who might do the software development: government agencies, companies and/or universities. I suggested that ANU students were capable of producing software, but this needed to fit into their educational program, which requires a six or twelve month cycle. It is easier to fit open source prototypes, tools and research into education than proprietary production code.

Students who have undertaken ANU's "IT in e-Commerce" (COMP3410/COMP6341) and similar university courses will be familiar with XML, XSL, DTD, CSS, knowledge discovery, Web Services, metadata, web-based data mining, data management, security, encryption, authentication and the like. But they will still struggle with the problems of the politics of data.

There was discussion of Wikis and mailing lists (the sort of thing the ACS Green IT Group is using). There was also a discussion of metadata entry tools.

    There was a healthy skepticism as to the status of international standards. Curiously there was no mention of Standards Australia.

    6. Initial Project Plan and Timeframes, Everybody:

    To be delegated to a meeting of steering and technical committees.

    7. Summary, Ben Searle.

    Additional Information - Possible Areas for Discussion

    Meeting closed at 12:25pm.

    Further thoughts

This was a very useful meeting, with people expert in the field and from leading organizations. However, a perception that government committees need to work in a particular way seems to be hampering progress. Use could be made of Web 2.0 and social networking technology for consulting and coordinating the work. In this way the inertia of conventional committees could be avoided.

A major problem with efforts such as the Australian GeoNetwork Developers Group is finding out who may want to be involved. This can be overcome by placing the information online so that interested people can discover it. The next step is then to invite them to comment and participate. Rather than following a rigid plan, anyone interested can be invited to take part, using generally agreed standards and open source systems.

An example of where a looser method of coordination was used is the web-based open source disaster management system for an Indonesian earthquake. Instead of conventional documentation, the Indonesian IT students doing the work convinced me that a Wiki could be used. The result was a more social, human and inclusive document than would be usual for an IT project.

    In Which Repository?

The approach to metadata and repositories for geodata is much the same as that used for other types of data, such as statistics, documents and cultural records. The geoscience community have much to gain from being able to work with other such communities of interest, and much to lose if they do not.

As an example, the ANU has an electronic repository, Demetrius (named after the first Librarian of Alexandria). It holds mostly materials from the humanities, with culturally significant archives such as photos of the pubs of NSW. Geoscience also holds electronic copies of research publications across disciplines. If the Geoscience materials are not visible in the general repository they may never be found by potential users. Policy makers may not even notice that geoscience is making a useful contribution and therefore not fund it.

    Courses

ANU is offering courses in its System Approach to Management of Government Information. This was developed for the National Archives of Australia for teaching e-document management to public servants. It includes a short version of my metadata/e-repository lectures. This could be expanded to include more scientific aspects of metadata and geodata. That would be much more interesting for the students than learning how to file government paperwork electronically. ;-)

    Smart Rooms

One aspect of Geoscience is the need to have computerized measurement equipment in the field. Following up on the lunch discussion after the meeting with some of the participants, my proposal for a transportable smart room might be useful. As well as being used for school children at remote indigenous communities and for command and control on the new amphibious ships HMAS Canberra and HMAS Adelaide, the technology could be used for geoscience at remote locations, with something more modest than that used for arctic research.


ps: There seems to be no ISO 19115 entry in the Wikipedia in English. Perhaps Australia could contribute one.

    Labels: , ,