Saturday, February 06, 2010

Government reports as ebooks

One response to my talk on "Making e-Books for e-Learning on i-Pads" at BarCamp Canberra 2010 was from Senator Kate Lundy. She tweeted: "With so much govt information online, Tom's talk makes me wonder about the merit of publishing public info in ebook formats too". This seems an idea worth investigating.

I have long advocated providing government reports as a set of web pages, rather than as one big PDF file, as is typically done. However, government people are reluctant to do this.

One argument against web pages is that they are harder to make, but as I show my web design students, an accessible approach to design makes this straightforward. If the document designer concentrates on making a document people can read online, where most reading will happen, rather than on producing a pretty printed report (which hardly anyone will see), then web format is a viable option.

Another argument is that web pages are not legal documents, which, as I explain to my electronic document students, is not true either. There is a commonly held, but incorrect, assumption that government reports must be in PDF format to stop them being edited. It is more difficult to edit a PDF file than a web page, but not impossible. In any case this is irrelevant to the protection of government reports.

But I suspect the real issue is that a set of web pages does not seem as real as a "book" and lacks the look of authority a government report demands. Collecting the web pages up into an ebook format may give them the needed gravitas. This could be done with a three step process:
  1. Here is the printed report, see it looks like a proper printed document,
  2. Here is the ebook, see it looks like the printed report,
  3. Here is the web page, see it looks like a chapter from the ebook.
As government agencies are already using content management systems, it should be feasible to support commonly used ebook formats with minimal effort by authors and publishers. The CMS would simply collect up a set of web pages and package them in an ebook format (a simpler system would do the reverse, saving the e-book and unpacking it on request to separate web pages, which might better meet archiving requirements).

As discussed in my talk on "Making e-Books for e-Learning on i-Pads", the obvious e-book format to use is EPUB. This is based on XHTML and CSS, as already used by government web sites. It is also being popularised as a format by its support on the Apple iPad. EPUB requires some extra XML files, but these supply information which agencies are required to provide anyway and should already have in their systems.
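To illustrate how little extra is involved, here is a rough sketch, in Python, of how a CMS might package a set of XHTML pages into an EPUB: a ZIP file whose first entry is an uncompressed mimetype file, plus a container.xml pointing at an OPF package file carrying the Dublin Core metadata. The file names and metadata values are made up for illustration, and a conforming EPUB 2 file also needs an NCX table of contents, omitted here for brevity.

```python
import zipfile

def make_epub(pages, title, out="report.epub"):
    """Package (filename, xhtml_text) pages into a minimal EPUB 2 container.

    Sketch only: a strictly conforming EPUB 2 file also needs an NCX
    table of contents and a spine 'toc' attribute, omitted here.
    """
    manifest = "\n    ".join(
        f'<item id="p{i}" href="{name}" media-type="application/xhtml+xml"/>'
        for i, (name, _) in enumerate(pages))
    spine = "\n    ".join(f'<itemref idref="p{i}"/>' for i in range(len(pages)))
    opf = f'''<?xml version="1.0"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">urn:uuid:example</dc:identifier>
  </metadata>
  <manifest>
    {manifest}
  </manifest>
  <spine>
    {spine}
  </spine>
</package>'''
    container = '''<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>'''
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as z:
        # The EPUB container spec requires 'mimetype' as the first
        # entry, stored uncompressed
        z.writestr("mimetype", "application/epub+zip", zipfile.ZIP_STORED)
        z.writestr("META-INF/container.xml", container)
        z.writestr("content.opf", opf)
        for name, body in pages:
            z.writestr(name, body)
    return out
```

The point is that everything beyond the web pages themselves is boilerplate a CMS can generate automatically.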

Convincing agencies to use an ebook format should be a lot easier than convincing them to use accessible web pages. Instead of having to explain why a whole lot of decorative junk is not a good idea and that information should instead be presented clearly and simply, it will be just a matter of saying "yes, that is a wonderful animated app, but unfortunately the ebook format does not support it".

There will be some inefficiencies, as ebooks are designed to be standalone. Therefore the CSS, logos and "about us" text which can be shared between web pages (and automatically inserted as required by a CMS) will have to be duplicated in each ebook. However, this duplication already occurs with PDF versions of reports, where fonts also contribute to the size of the resulting files.

Ebooks should also make archivists happy, as they include their own metadata. In fact ebooks are conceptually similar to the techniques used by electronic archiving systems, which wrap up all the associated files of an e-document along with an XML encoded set of metadata.
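That metadata can be pulled straight back out of the package. Here is a sketch (assuming a well-formed EPUB, with error handling omitted) of extracting the Dublin Core metadata from the OPF package file inside an EPUB:

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the EPUB container and Dublin Core metadata
DC = "{http://purl.org/dc/elements/1.1/}"
CN = "{urn:oasis:names:tc:opendocument:xmlns:container}"

def epub_metadata(path):
    """Return the Dublin Core metadata of an EPUB as a dict."""
    with zipfile.ZipFile(path) as z:
        # container.xml names the package (OPF) file, e.g. content.opf
        container = ET.fromstring(z.read("META-INF/container.xml"))
        opf_path = container.find(f".//{CN}rootfile").get("full-path")
        opf = ET.fromstring(z.read(opf_path))
    meta = {}
    for el in opf.iter():
        if el.tag.startswith(DC) and el.text:
            meta[el.tag[len(DC):]] = el.text
    return meta
```

An archiving system could harvest title, creator, date and identifier this way without any format-specific tooling beyond a ZIP reader and an XML parser.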

The public could still read an individual chapter of a report as an ordinary web page. The system could also still provide automatically generated PDF, if anyone wants it. But if the web version is offered first in the list of options online, I suspect most people will be happy to download a few dozen kilobytes of the summary of a report, rather than megabytes of the full report in PDF. I might try out the idea with my students this year and see if the practice then diffuses into the Australian government.


Tuesday, February 02, 2010

ODF Format for Danish Government Documents

According to Heise Media UK Ltd, the Danish Parliament has agreed to use the Open Document Format (ODF) for government documents, in preference to Microsoft's Office Open XML (OOXML) format. The Heise report says this will not come into effect until April 2011 and that PDF/A (the archival version of PDF) will also be used. There was some detailed and technical discussion in the Danish Parliament as to the compatibility of different versions of document standards. There is also a detailed report available on the problems of converting between the formats: "Document Interoperability: Open Document Format and Office Open XML" (Dr. Klaus-Peter Eckert, Jan Henrik Ziesing and Ucheoma Ishionwu, August 2009). This doesn't really say anything not known from previous comparisons: simple documents convert, but there are some problems with more complex ones.


Tuesday, December 08, 2009

Google AppJet Takeover

Greetings from the Sydney Wave User Group meeting. The first topic is the takeover of AppJet by Google. There was some concern in the user community that EtherPad's non-Google implementation would be discontinued. In response, at short notice, a decision was made to release it as open source.

Pamela Fox from Google gave a quick introduction to Wave, explaining it as three components: a product, a protocol and a platform. The product is Google's web based implementation. The protocol includes a way of having transactions and a data model, which allows companies other than Google to implement Wave. The platform includes Robots, Extensions and Gadgets. Robots are computer applications which can participate in the conversation, assisting human participants. Gadgets are small applications which extend the Wave interface. Extensions allow Wave content to be integrated with other systems.

Wave uses XML for storing and communicating content. Robots operate on XML "wavelets". Each wavelet has participants, a title and one or more "blips", which represent the atomic components of a conversation. The blips can be manipulated by Robots.

The blips contain text, annotations and elements. Annotations are similar to HTML markup. Elements are added items, such as a map. Keeping these components separate allows easier manipulation of the information. In a way the Wave approach to markup is analogous to wiki text, designed to make changes easier, particularly by multiple users. This allows, for example, changes in real time to the mark-up of a document. This is one of the features which Google finds "cool", but I find confusing.

Robots react to events and perform operations. Each robot has a URL to which events are sent and on which it performs operations. The state of a gadget is stored in the XML sent to the gadget and is therefore not secure. There are also all the usual problems of real time, multiple updates of the same data. An example of a robot is one which goes through waves converting web addresses to links.
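The Wave robot API itself is not shown here, but the core behaviour of that example robot, rewriting bare web addresses in a blip's text as links, might be sketched as follows (the function name and regular expression are illustrative, not part of any Google API):

```python
import re

# Matches a bare web address, stopping at whitespace or markup characters.
# Illustrative only: a production robot would use a more careful pattern.
URL = re.compile(r'(https?://[^\s<>"]+)')

def linkify(text):
    """Replace bare web addresses in a blip's text with HTML anchors."""
    return URL.sub(r'<a href="\1">\1</a>', text)
```

A Wave robot would apply a transformation like this to each blip it is given, leaving the rest of the conversation untouched.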

ps: This is about the third Google Wave event I have attended over the last six months. The first glimmerings of understanding of what Google Wave is are now starting to form in my mind. That is three times as long as it took me to get "the Web". I remain to be convinced that Wave is the next big thing, or is significant at all. The Web turned out to be based on a few simple ideas and lots of reuse of existing technology. I am not sure Wave could be similarly decomposed.


Monday, December 07, 2009

Google Wave Real-time Document Collaboration

According to news reports, Google has acquired AppJet so that its EtherPad real-time document collaboration tool can be incorporated in Google Wave. I was not impressed with Google Wave's interface, and offering EtherPad's familiar word-processor-like interface on top will be an improvement. The idea is to allow several people to edit the same document at the same time and see the changes each other is making in real time. I will be including some of this in my new ANU course "Electronic Data Management" (COMP7420).


Friday, August 14, 2009

Future of Scholarly Communication

Greetings from the National Library of Australia in Canberra, where Dr David Prosser, Director of SPARC Europe is speaking on "Open Access and the Future of Scholarly Communication: Dissemination, Prestige, and Impact". He started by talking about the political imperative for access to information, both as a right and as a way to drive the economy. Governments which fund research are demanding measures of results, which provides an impetus for open access to increase use of research output, with e-science and e-research.

Dr Prosser pointed out that the revolution of the Internet is real, with 90% of scholarly journals online. The problem is that the new technology is matched with an old business model of subscription access. In some cases, access to one paper might cost several thousand dollars, even when the author of the paper gave away their copyright for free. He talked about how the traditional paper journal is a bundle of services which can be unbundled in the online environment. He jokingly congratulated Australia for not signing the "Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities", as those who sign tend to feel they need to do no more. Examples of successful open access policies were the Wellcome Trust and the US NIH (Professor Larry Clarke from NIH is a keynote speaker next week at HIC09 in Canberra).

Dr Prosser criticised the Australian Research Council for not requiring open access to funded research. That policy followed a submission to the ARC by Professor Arthur Sale FACS, which I signed on behalf of the ACS (along with other organisations).

Dr Prosser speculated on new forms of scholarly publishing online, with institutional repositories being used as a source and forum, more closely integrated with research. He used NanoHub as an example, and drew an analogy with the way academic libraries have now integrated teaching spaces (learning commons).

For the future, Dr Prosser asked if papers should be designed to be machine readable, rather than human readable. He asked if they should be static, or can they be updated as new results become available. He asked if wikis and blogs have any long term academic value.

There is a paper "Institutional repositories and Open Access: The future of scholarly communication" (Journal of Information Services and Use, 2003) and a copy of an older presentation by David available online, covering many of the topics in his current Australian presentation "Open Access and the Evolving Scholarly Communication Environment":


Open Access and the Evolving Scholarly Communication Environment

David Prosser • SPARC Europe Director


SPARC Europe

Scholarly Publishing &
Academic Resources Coalition

  • Formed in 2002 following the success of SPARC (launched in 1998 by the US Association of Research Libraries)
  • Encourages partnership between libraries, academics, societies and responsible publishers
  • Originally focused on STM, but coverage expanding
  • Has over 110 members in 14 countries
  • By acting together the members can influence the future of scholarly publishing

The Effect of the Internet

  • Opportunities for expanded access and new uses offered by
    • ever-expanding networking
    • evolving digital publishing technologies and business models
  • New dissemination methods
  • Better ways to handle increasing volume of research generated
  • 90% of journals now online

The Situation Today – Dissatisfaction at Many Levels

  • Authors
    • Their work is not seen by all their peers – they do not get the recognition they desire
    • Despite the fact they often have to pay page charges, colour figure charges, reprint charges, etc.
    • Often the rights they have given up in exchange for publication mean there are things that they cannot do with their own work
  • Readers
    • They cannot view all the research literature they need – they are less effective
  • Libraries
    • Even libraries at the wealthiest institutions cannot satisfy the information needs of their users
  • Funders
    • Want to see greater returns on their research investment
  • Society
    • We all lose out if the communication channels are not optimal.

Open Access

What is it?

Call for free, unrestricted access on the public internet to the literature that scholars give to the world without expectation of payment.


Widen dissemination, accelerate research, enrich education, share learning among rich & poor nations, enhance return on taxpayer investment in research.


Use existing funds to pay for dissemination, not access.

Budapest Open Access Initiative

Two complementary strategies:

  • Self-Archiving: Scholars should be able to deposit their refereed journal articles in open electronic archives which conform to Open Archives Initiative standards
  • Open-Access Journals: Journals will not charge subscriptions or fees for online access. Instead, they should look to other sources to fund peer-review and publication (e.g., publication charges)

What are Institutional Repositories (Open Archives)?

Essential elements

  • Institutionally defined: Content generated by institutional community
  • Scholarly content: preprints and working papers, published articles, enduring teaching materials, student theses, data-sets, etc.
  • Cumulative & perpetual: preserve ongoing access to material
  • Interoperable & open access: free, online, global

The Benefits of Institutional Repositories

  • For the Individual
    • Provide a central archive of their work
    • Improved discovery and retrieval
    • Increase the dissemination and impact of their research
    • Acts as a full CV
  • For the Institution
    • Increases visibility and prestige
    • Acts as an advertisement to funding sources, potential new faculty and students, etc.
    • Helps in administration, e.g., Research assessment and evaluation
  • For Society
    • Provide access to the world’s research
    • Ensures long-term preservation of institutes’ academic output

What is a Journal?

Scholarly publishing comprises four functions:

Current model:

  • Integrates these functions in journals
  • This made sense in print environment




[Diagram: the four functions, including registration of research, certifying the quality of the research, and archiving for future use]





The Four Functions - Repositories




[The same four-functions diagram, repeated for repositories]








  • Certification gives:
    • Authors – Validation of their work (important for promotion and grant applications)
    • Readers – Quality filter
  • Journals provide peer review and give a ‘quality stamp’ to research and authors
  • Journals should be open access

The Four Functions of a Journal




[The same four-functions diagram, repeated for the journal]







Open Access


How the pieces work together









[Diagram: how the pieces work together, linked by interoperability standards: certification e.g. peer review; awareness e.g. search tools, linking; archiving e.g. by the library]

Theory Into Practice
- Institutional Repositories

  • GNU EPrints – Southampton
  • D-Space – MIT
  • CDSWare – CERN
  • ARNO – Tilburg, Amsterdam, Twente
  • Fedora – Cornell University / University of Virginia

  • DARE – The Netherlands

Theory Into Practice
- Institutional Repositories

OpenDOAR (Directory of Open Access Repositories)

  • An authoritative directory of academic open access repositories
  • List of over 1425 repositories
  • Can be used to search across content in all listed repositories
  • Gives information on repository policies (copyright, re-use of material, preservation, etc.)

Theory Into Practice
- Open Access Journals

  • Lund Directory of Open Access Journals – lists over 4250 peer-reviewed open access journals
  • PLoS Biology (launched 2003), PLoS Medicine (2004), PLoS Computational Biology, PLoS Genetics, PLoS Pathogens (2005)
  • BioMed Central (published over 54,000 papers)
  • Documenta Mathematica (Ranked 24th of 214 mathematics journals listed by ISI)
  • SPARC Europe has helped to launch the Open Access Scholarly Publishers Association (OASPA) to represent the interests of open access publishers

Open Access – Making the Transition

  • Give Authors the choice:
    • If they pay a publication charge the paper is made open access on publication.
    • If they do not pay the publication charge the paper is only made available to subscribers.
  • Over time, as the proportion of authors who pay increases, subscription prices can fall
  • Eventually, entire journal is open access

Open Access – Making the Transition

  • A number of ‘traditional’ publishers are transforming their closed access journals into open access journals:
    • Proceedings of the National Academies of Science (PNAS)
    • Oxford University Press
    • American Institute of Physics
    • Company of Biologists
    • American Physiological Society
    • American Society of Limnology and Oceanography
    • Springer
    • Blackwell’s

The Power of Open Access – Self Archiving

  • For 72% of papers published in the Astrophysical Journal free versions of the paper are available (mainly through ArXiv)
  • These 72% of papers are, on average, cited twice as often as the remaining 28% that do not have free versions.

Figures from Greg Schwarz

  • Tim Brody from Southampton has shown that papers for which there is also a free version available have, on average, greater citations than those that are only available through subscriptions

The Power of Open Access – Journals

  • Open access PNAS papers have 50% more full-text downloads than non-open access papers

  • …and are on average twice as likely to be cited

What Institutions Are Doing


Institutional repositories:

    • Set-up and maintain an institutional repository.
    • Help faculty deposit their research papers, new & old, digitizing if necessary.
    • Implement open-access policies

Open-access journals:

    • Help promote open access journals launched at their institution so that they become known externally.
    • Ensure scholars at their institution know how to find open access journals and archives in their fields.
    • Support open access journal ‘institutional memberships’ (e.g. BioMedCentral, PLoS)
    • Engage with politicians and funding bodies to raise the issue of open access

Open Access – Appealing to All the Major Stakeholders

  • To the funders of research – both as a public service and as an increased return on their investment in research
  • To the authors – as it gives wider dissemination and impact
  • To readers – as it gives them access to all primary literature, making the most important ‘research tool’ more powerful
  • To editors and reviewers – as they feel their work is more valued

Open Access – Appealing to All the Major Stakeholders

  • To the libraries – as it allows them to meet the information needs of their users
  • To the institutions – as it increases their presence and prestige
  • To small and society publishers – as it gives them a survival strategy and fits with their central remit

A Changing Environment

“It is one of the noblest duties of a university to advance knowledge, and to diffuse it not merely among those who can attend the daily lectures--but far and wide. ”

Daniel Coit Gilman, First President, Johns Hopkins University, 1878 (on the university press)

An old tradition and a new technology have converged to make possible an unprecedented public good.

Budapest Open Access Initiative, Feb. 14, 2002


Wednesday, August 12, 2009

Senator Lundy describes her Public Sphere initiative

A ten minute video "Senator Lundy describes her Public Sphere initiative" is now available. This was made for my students at ANU studying Information Technology in Electronic Commerce COMP3410. For an assignment the students have to work out what metadata is appropriate to support such public discussions and to archive video used in policy making.

Public Sphere uses a mix of blogs, wikis, instant messaging, video and other tools, in different formats on different systems. This is fine for a pilot, but if this approach is to be used routinely for public policy making, then a system which allows easier set-up, use and archiving all of the material (perhaps for hundreds of years) is needed.

The Government 2.0 Taskforce Road Show starts consulting the public on 17 August in Canberra, followed by other locations around Australia, ending in Darwin on 2 September 2009. The Taskforce has 15 experts, chaired by Nicholas Gruen, and was announced at Public Sphere 2 (video and transcript available).

The open source XENA tool ("XML Electronic Normalising of Archives"), which the National Archives of Australia use, can then be modified to convert discussions and video from public policy events into long term storage formats.


Monday, August 03, 2009

Future of Scholarly Communication in Europe

Dr David Prosser, Director of SPARC Europe will speak on "Open Access and the Future of Scholarly Communication: Dissemination, Prestige, and Impact", at the National Library of Australia in Canberra, 14 August 2009.
ANU Division of Information and the National Library of Australia Present:

Open Access and the Future of Scholarly Communication: Dissemination, Prestige, and Impact

Dr David Prosser
Director, SPARC Europe

Friday 14 August, 12.30-1.30pm
Conference Room, 4th Floor, National Library of Australia
Parkes Place, Canberra, ACT

This lecture is free and open to the public.

Enquiries: T: 02 6125 2981 E:

The internet is having a profound impact on the 300-year-old model of scholarly communication. New technologies allow for new modes of interaction between researchers, and a wider audience of administrators, funders, governments and the general public. The lines between formal and informal communication are becoming increasingly blurred and publishers and librarians find themselves playing new roles in the scholarly communication chain. One of the most powerful new ideas to emerge with the development of the internet is open access – the notion that the scholarly research literature should be made available to readers free of charge. This presentation describes current developments within the scholarly communications landscape and provides an indicator of possible future directions.

David Prosser was appointed the first director of SPARC Europe in October 2002. Previously, he spent ten years in science, technical, and medical journal publishing for both Oxford University Press and Elsevier Science. During this time he was involved in all aspects of publishing from production through to editorial and financial management of journals.

Before becoming a publisher he received a PhD and BSc in Physics from Leeds University, UK.

SPARC Europe is an alliance of European research libraries, library organizations and research institutions, providing a voice for the community and the support and tools it needs in order to bring about positive change to the system of scholarly communications.

Its members represent over 100 leading academic and research institutions in over 14 European countries. ...

From: Open Access and the Future of Scholarly Communication: Dissemination, Prestige, and Impact, ANU, 2009


Tuesday, June 23, 2009

Extending university events into the online world

The Australian National University College of Engineering and Computer Science held a Poster Day on the topic of "Connecting Research to Business" on 22 June 2009 in Canberra. Unfortunately I could not attend, as I was at Parliament House talking about Government 2.0. It occurred to me that it would be useful to extend the poster day into the online world, to allow those who could not be there in person to take part. I suggested this to the College and have been invited to put forward some ideas on how to do it, with the easiest options first. Some thoughts on this follow, and suggestions would be welcome.

In the case of the CECS day, the posters are PDF files, designed to be printed as one-page A0 size documents in landscape layout. It is intended that the poster is placed on a wall and read from a distance of about one to two metres. The author(s) of the poster stand nearby, give a brief presentation and answer questions.

Put the Posters Online

The first obvious suggestion is to provide the posters online. This could be done after, or preferably before, the poster day. This would allow those attending to study the posters in more detail later, allow those planning to attend to preview the work, and give access to those unable to attend. It would also provide a permanent archive of the work and would allow those interested in a topic to find the information (and CECS) using a web search.

Provide a directory of the posters

In addition to the individual posters a directory of the posters would be of use. This would have the title and author of each poster, and perhaps a one paragraph summary, with a hypertext link on each title to the poster. The template for posters should include a hypertext link back to the directory.

The directory could be done as a PDF document, but would be better as a simple web page in HTML. The document can be designed using CSS media types so that when printed it can be used as the directory of the posters on the day.
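As a sketch of how simple such a directory page could be, here is illustrative Python that generates the HTML (the field names and layout are my own assumptions, not a CECS template):

```python
import html

def poster_directory(posters):
    """Build a simple HTML directory page for a set of posters.

    `posters` is a list of (title, authors, summary, url) tuples;
    the fields and page structure here are illustrative only.
    """
    items = "\n".join(
        f'<li><a href="{html.escape(url)}">{html.escape(title)}</a> '
        f'by {html.escape(authors)}<p>{html.escape(summary)}</p></li>'
        for title, authors, summary, url in posters)
    return f"""<!DOCTYPE html>
<html lang="en">
<head><title>Poster Directory</title></head>
<body>
<h1>Poster Directory</h1>
<ul>
{items}
</ul>
</body>
</html>"""
```

A CMS could run something like this over the poster submissions, and a print style sheet would then turn the same page into the paper handout for the day.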

Provide Posters Which Can Be Read Online

Posters designed for printing at A0 size are not easy to read online. A desktop computer screen is the equivalent of about one A4 page. An A0 page is sixteen times the size of an A4 page, so only a limited amount of the content will fit on the screen. Also, if printed at A4, the poster will be unreadable. As an example, the sample poster provided by CECS has text too small to read when displayed on a 15 inch desktop screen or when printed on an A4 page.

The poster content should be formatted so that it will display on a computer screen and print on A4 pages, in a readable format, as well as A0. This can be done by formatting the poster using the reflow option in PDF, or preferably using fluid web page design. In this way, when displayed on screen, the content will reformat to fit the smaller space automatically.

PDF has an option to "reflow" the content of a page to automatically fit the display screen. However, this option is not available in older PDF viewers and does not work correctly with some later versions. CSS fluid formatting in HTML will produce more reliable reflowing of a web page than PDF.

Add hypertext links

References in posters can be hypertext linked to related documents. These links can be suppressed, so they do not display in the printed version, using CSS media types.

Add additional material

An audio or video description of the poster can be offered to accompany the online version. This can be simply a recording of the presentation the author gives for the poster. The digital audio or video file can be provided in a hypertext link in the directory and/or the poster. There is no need to provide automated playing of the recording, nor synchronisation with the display of the poster, just a link will do.

Invite Comments Online

Online comments and questions can be invited for a period, before, during or after the poster day. This can be managed using a forum tool, such as that provided in the Moodle Learning Management System used by ANU for courses, or OJS as supported by CECS for IFIP publications. These tools can also be used to manage the soliciting for and submission of the posters, and to publish them.

Stream Poster sessions

Presentations of the posters can be streamed live via the web with audio, video or web casting. However, this requires considerable preparation and planning.


Wednesday, June 17, 2009

Role of the web in bushfire warnings

The 2009 Victorian Bushfires Royal Commission is addressing the role of the web in providing warnings to the public. Professor John Handmer, author of "Handbook of Disaster and Emergency Policies and Institutions", gave evidence on 16 June 2009. The statement is not yet online (the commission secretariat told me they have some "technological issues" with statements at present), but the Transcript of Proceedings is. Below are some excerpts dealing with the web and Internet. I agree with the general approach suggested by Professor Handmer, but would like to see simple, efficient web mark-up used for warnings, rather than plain text.
You note in paragraph 16 that the audience for a warning may be hugely variable and towards the end of that paragraph you note that, "People go to different sources. Some community members may be habitual uses of the internet, others might be more likely to turn to the radio, others might use personal networks. There are different preferred modes of receiving information." How does that then impact on the way that one should take care to disseminate warnings?---Ideally - I mean the community at risk is infinitely diverse. Each individual, we could argue, has a unique preferred way of receiving a warning, but at some level we have to stop, I suppose. But ideally the modes that are the preferred ways for that community at risk to receive their information should be the modes that are used, given whatever is practical, and that means, almost always it means that there would be several modes.

So it would be preferable in your view to use the internet as well as ABC Radio and perhaps even give consideration to other modes like phone calls or Twitter sites?---Yes, that's right. They are all reasonably technological means. One could argue that in many communities to ensure that the more vulnerable people - it depends on the community - are reached, we would probably need to get into the local networks, the personal networks or the community networks to try to activate, if you like, the neighbourhood to make sure that people who may not receive warnings via those modes receive them either by direct personal contact or some other way, and that they make sure that they are in a position to take what sort of protective action is needed. But this is tapping into what we call the informal warning system. Is there another benefit to disseminating by more than one means, namely in case of failure of one means or imperfect delivery of one means during a crisis?---That's right. We would argue that reliance on any single mode of dissemination is pretty risky, partly because it is not going to get to everybody no matter what it is and, secondly, any single mode is subject to failure or congestion or interruption.

The next aspect you turn to in your statement is timeliness and you note in paragraph 17, "A warning should be delivered in a timely manner so as to allow people to confirm what they have to do and take action in time." Is that a feature you have noticed in your research, that people usually seek confirmation from further sources before they act?---There are two things that come out of the research, main things. One is what you have just said, that people will almost always seek confirmation. Officials will, too. But people at risk will seek confirmation usually by mobilising their personal networks or if they hear something, read something on the web, listen to the radio or TV or ring somebody or vice versa. This is pretty normal and we have found often people - they also might want to ascertain the location of other household members. There are a number of things go on typically before people take action. The other thing we have noticed is that very frequently people receive the warning or at least understand that the warning is important to them too late to do anything useful. ...

Websites. Can we go to question 5, which starts at page 0018, and you note in paragraph 67 that web-based material has really become the primary source of information in our society. In paragraph 69 you make some points about who uses the internet. You say that even though it seems ubiquitous, in 2006 about a quarter of Victorians didn't have internet access. So, although that is a declining proportion, that needs to be kept in mind. That comes from the census data, is that right?---That's right. So it remains the case that the web is not a fix all. One would need to keep in mind promoting messages through ABC Radio and other means?---That's right. The point there is that a proportion of households, and they are likely to be people who are more vulnerable, elderly people and so on, do not have web access. It is also an interesting thing that people who promote the web as a vehicle for warnings have an implicit assumption that people are out there actively seeking their warnings on the web. We don't have evidence for that.

That's an important point you make at point 3: "Websites offer a passive form of warning. That is, they don't alert you to come and read them, although you will find the message if you go and look for it"?---That's true. There are a variety of ways of overcoming that and making websites active through all kinds of tools that can send the messages to you now, Widgets, Twitter and so on. But, nevertheless, the basic principle is that a website is a passive form of warning.

It could be used in conjunction, though, couldn't it, with those other tools you mentioned. If there was a SEWS signal played on the radio or an automated phone call or a text message, part of which suggested looking at a website, that might combine the call to action with finding more information on the website?---It could, or it could simply be that the material on the website is sent to your mobile phone or whatever by one of these devices and there are several possibilities with that.

You note over the page on 0019 some issues about currency and reliability and the issues which may arise when a website is under heavy demand. We touched on this when you spoke of your own experience on 7 February. Is there a way to address the situation when websites are under heavy demand and therefore slow down or even become inaccessible?---They tend to slow right down, that's right. There are a number of ways of addressing it.

Probably the simplest way is for people to take the information off the site automatically and feed it onto other sites or other systems. In the fires on February 7th the material from the CFA site was re-posted, if you like, via Twitter. There was an unofficial site, CFA updates, which was a Twitter site, and that is still active, actually. That was one of a number of sites that on the day took material unofficially from the site. There is a way of doing it which is quite legitimate and CFA encourage it. So, that's one way. What that does is take the load off the site. Another way is to ask people not to use it or to restrict access, but that doesn't seem very promising to me, given that we actually want people to use it, but that's a standard response. Otherwise, there are a number of technical ways of doing this which I outline in the paper. They are basically about reducing the degree of interactivity with the site, so that when you go into the site you don't actually - what you get is just sitting there. The amount of processing power that site needs to use is limited one way or another. Things like graphics, logos and so on, which we have more and more of them on our sites, are pretty hungry for memory.

The idea is not to use them in these emergency situations. In one sense it is an argument for moving to a different website mode in a major emergency when you know the demand is going to be great. I don't know whether I mention it here, but after the tsunami the British Commonwealth and Foreign Office or Foreign and Commonwealth Office website on travel advisories and so on switched to a text only mode for precisely this reason.

And that reduces the memory use?---That's right. It can handle a lot more inquiries.

I note in paragraph 72 you suggest, if we just deal with websites bit by bit, you suggest first of all that it would be useful for there to be one website rather than the DSE and the CFA websites?---A lot of people are arguing this, that there should be one website, but it is a trade-off, I want to say, as well, because if there is one website, all the problems we are talking about in terms of website overload and so on are exacerbated. The solution of course is that there are two sites but they mirror each other's content.

So two sites with the same content or multiple sites with the same content may help?---Yes. I think a single site in terms of content is the ideal, but if we look at the practicalities and the reliability, we are much better off having a number of sites.

Is there also potential to enable information within a website to be hived off, namely to enable people to look at particular messages pertaining to particular parts of Victoria so that they are using different pages or different information at the one time?---Yes, there are a range of devices and so on that can be embedded in sites to do that, and even to send them to the people concerned. You set out all these matters working through to paragraph 80 in the statement. Paragraph 77 is where you deal with the RSS feed. This is the capacity you spoke of for the material on an internet site to be mirrored, if you like, over on a Twitter site?---Yes, but not quite. The RSS feeds really just take key information. They don't take the whole information of the site. That is one reason why they can actually feed information on to sites like Twitter or even mobile phones if the system is enabled. They take headliners, basically.

Dealing with sirens, which is question 6 - - -

COMMISSIONER PASCOE: Before we leave the websites, a question about the Bureau of Meteorology site which had, we are told, 70 million hits on the day and is used to having a massive - - -?---It is the most popular in Australia, I think, the most popular government site.

I don't know whether you have looked at the features of that site and what enables that site to cope with the heavy demand vis-a-vis the sites that we have just been talking about and whether there are any lessons we can learn from the bureau website?---I'm sure there are, but I haven't personally investigated them, but a lot of the bureau's material is in very basic text form and I think that's probably one of the key features of enabling that site to handle such loads. But I think that would be a worthwhile. I think it is the fourth most popular site in the country. ...

Turning to new technology, question 7, this is a matter you discuss in paragraphs 91 onwards and you refer to the new technologies which have emerged. You make the point in paragraph 93 it is important not to overlook our longstanding communication technologies, including radio. In paragraph 95 you say that it is important to distinguish between new technologies that deal with the centralised systems, such as CAP, and those that relate to individualised information. I take it from what you say here there is certainly a role for new technologies to play and it is a field that continues to develop?---I think the new technologies, in terms of delivering a message, as we were discussing, to the people at risk, have only very recently started to play a major role, but it has been quite quick and now most people in our society, I would say the majority of people by far use either a mobile phone, text, are very familiar with texting and the internet as their normal means of gaining and sending information or whatever. So we have to use them if we want to reach particular audiences and there are many variations of those modes.

Because you mention in paragraph 98 Facebook sites that are mostly post-fire, but Facebook sites, MySpace sites and in paragraph 99 the Twitter site as new technologies being used by portions of the community that ought not be overlooked?---That's right. Some of these played a role, like Twitter sites, in warnings. There is anecdotal evidence that people got warnings on Facebook because they were looking at some aspect of Facebook and suddenly some message came across. But people weren't using Facebook, as far as I can see, for warning purposes but it fulfilled that role.

At paragraph 100 you refer to phones and mobile phones and you make the point obviously they are very familiar. For landline phones, about halfway through paragraph 100, you note the technology which enables locations connected to landlines to be selected which could be used to delimit areas. That might be useful, for example, in any automated phone warning system?---Yes. That's the idea, yes.

You point out the advantages, but also the disadvantages. There may be lack of mobile phone coverage, there may be issues with phone traffic?---And there is a privacy issue with unlisted numbers and so on. But, yes.

Are you familiar with the recent announcement by the Commonwealth government to now establish a national phone automated warning system?---Yes, I am familiar with that. You refer to the common alerting protocol. It, as you mention there, is really a mode of standardising the content of warnings to ensure that it is the same over different modes of dissemination?---Yes. The common alerting protocol relates to what we were discussing a while ago, the write-it-once concept. As you say, it is a standardised message, it has a standardised format and then the idea is that this message can then be disseminated over any number of digital modes. So it has that advantage of speed and also has advantages in being able to go on multiple modes that perhaps would have to be manually uploaded in the past. ...

From: Transcript of Proceedings, 2009 Victorian Bushfires Royal Commission, Tuesday 16 June 2009, 24th day of hearing
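The Common Alerting Protocol discussed at the end of the transcript is an OASIS standard XML format: the "write it once" idea is that a single standardised message can then be disseminated over any number of digital modes. As a rough sketch of what such a message looks like (the identifiers, sender address and event details below are invented for illustration), a minimal CAP 1.2 alert can be assembled with Python's standard XML library:

```python
import xml.etree.ElementTree as ET

CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"

def make_cap_alert(identifier, sender, sent, headline, area_desc):
    """Build a minimal Common Alerting Protocol (CAP) alert document."""
    ET.register_namespace("", CAP_NS)
    alert = ET.Element("{%s}alert" % CAP_NS)
    # Required alert-level elements
    for tag, text in [("identifier", identifier), ("sender", sender),
                      ("sent", sent), ("status", "Actual"),
                      ("msgType", "Alert"), ("scope", "Public")]:
        ET.SubElement(alert, "{%s}%s" % (CAP_NS, tag)).text = text
    # One <info> block describing the event itself
    info = ET.SubElement(alert, "{%s}info" % CAP_NS)
    for tag, text in [("category", "Fire"), ("event", "Bushfire"),
                      ("urgency", "Immediate"), ("severity", "Extreme"),
                      ("certainty", "Observed"), ("headline", headline)]:
        ET.SubElement(info, "{%s}%s" % (CAP_NS, tag)).text = text
    area = ET.SubElement(info, "{%s}area" % CAP_NS)
    ET.SubElement(area, "{%s}areaDesc" % CAP_NS).text = area_desc
    return ET.tostring(alert, encoding="unicode")

xml_text = make_cap_alert("XX-2009-0207-001", "agency.example.gov.au",
                          "2009-02-07T11:00:00+11:00",
                          "Bushfire approaching Kinglake area",
                          "Kinglake and surrounds")
```

The same `xml_text` could then be handed, unchanged, to the web site, RSS feed, SMS gateway and automated phone system, which is the advantage of speed and multiple modes the witness describes.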


Wednesday, February 18, 2009

Learning e-learning

During the Green ICT Symposium today I demonstrated the ANU's new Wattle e-learning system. This uses Moodle and I was able to show courseware using Senator Lundy's Blackberry smartphone.

An odd little quirk I have discovered is that the ANU's "Policy: Determination of Systems and Consultation on Assessment" requires the proposed assessment system for each course to be made available to prospective and enrolled students "... both in hard copy and in electronic form". It seems a little odd, if a student is enrolled in an online course, which they can do from the other side of the world, that the ANU insists on providing them with a sheet of paper about the assessment. It would make more sense to use the same electronic means as used to deliver the course.

There is no legal obligation to provide the material printed on paper (the High Court recognised that electronic documents are legal some years ago). So I have suggested the ANU Registrar have this deleted from the policy.


Tuesday, February 17, 2009

Federal Court Guidelines on e-Discovery

The Federal Court of Australia issued "The use of technology in the management of discovery and the conduct of litigation" on 29 January 2009 (Practice Note No 17). Justice Teague might also consider these of use for his royal commission into the Victorian bushfires. The guidelines set out the use of electronic documents in court proceedings. They are intended to be used where a significant number of the documents in a case are electronic (usually 200 or more), so that handling them electronically will speed up the process and lower costs.

The Practice Note cites documents provided on the court web site for:
  1. Default Document Management Protocol for 200 to 5,000 e-documents,
  2. Advanced Document Management Protocol for more than 5,000 documents,
  3. Pre-Discovery Conference Checklist
  4. Pre-Trial Checklist
  5. Glossary


Thursday, February 12, 2009

Free computers for bushfire victims

The Australian Information Industry Association (AIIA), together with Computers Off Australia and the Australian Computer Society, is coordinating the provision of free computers to Victorian bushfire areas. ICT companies have been asked to register their support via the AIIA website.

Unfortunately, when I went to register my company, the site returned "404 Not Found". So one service I could provide is web site testing. I designed the AIIA's first web site while on a course at the Melbourne Business School in 1995.

More seriously, I would like to offer the people of Victoria what assistance I can in dealing with this and future emergencies.
There may also be some scope for modular computer equipped classrooms to be deployed to replace burnt out schools, libraries and other facilities.


ICT at Victorian February 2009 Bushfire Royal Commission

On 9 February the Premier of Victoria announced a Royal Commission into the weekend bushfires. The inquiry will be a large undertaking. As with the ACT Coroner's investigation into the 2003 Canberra Firestorm, computer based systems are likely to be extensively used in the investigation.

ACT Coroner's court bushfire setup

The ACT Coroner's court was equipped with about fifteen large LCD screens, on the desks for the legal teams, for the Coroner and for the witness. The screens displayed electronic documents in evidence and a continuous transcript of what was being said. There was an operator at the front who controlled what documents were displayed and had an electronic document camera to scan new documents.

While most screens were displaying the same evidence, individuals could use a web browser to view other documents and carry out searches. Large wall mounted flat screens displayed the same electronic documents as on the LCD screens. A video monitor showed what the cameras were recording of the Coroner, the witness, the general room and the document display. At the back of the room were two people monitoring the video, audio and text recording.

There were problems with the room layout. The large LCD screens blocked sightlines to the bench and witness. The wall mounted screens were not readable from the back of the room.

There were microphones at each position. However, a witness reported having difficulty hearing what was being asked, as did the observers. The screen on the witness stand was at the side, so when asked to examine a document the witness had to turn away from the room and towards the side wall. In this position they could not see the person asking the question, nor could the microphone pick up their answer.

It appears that documents were scanned in from paper originals. This worked well for text, but not for maps (important for an inquiry into where the fire and the firefighters were). There was a paper colour map at the back of the room, but the electronic one used appeared to be a monochrome scan of an A4 page. The maps had been scanned at too low a resolution, so they could not be digitally enlarged.

The process for calling up documents was cumbersome: the person asking the questions was working from paper notes and had to ask for a particular document to be displayed on screen. This involved reading out a long reference number to be transcribed by the operator at the front of the room. So there were delays in getting the right document up.

The adversarial nature of the inquiry process also meant the technology could not be used fully. The evidence had to wait for a verbal question and answer process. Legal objections resulted in delays while the possible consequences of a question yet to be asked were considered. An online system could greatly speed the inquiry by allowing much of the process to be carried out without the parties being physically present in one room, or at the same time.

Where the hearing room is still used, the process might benefit from the use of an interactive electronic whiteboard. This could be used to display and directly interact with the evidence, particularly maps. Witnesses could point to locations on the map and have where they were pointing electronically recorded. This would cut out the time wasting process of someone verbally describing a map location, the witness responding, and the questioner then trying to interpret what they said verbally for the record.

UN Oil For Food Program Inquiry

The computerised hearing room used for the 2005 inquiry into the UN Oil For Food Program also provides some lessons. The inquiry used a similar hearing room arrangement to the ACT bushfire inquiry. There were approximately 20 desktop computers and laptops in the room. Two wall mounted projection screens (with projectors ceiling mounted) were used to display evidence to the observers.

The commissioner had three screens on his bench: one at the front and one on each side. This made it convenient for him to see a screen, but resulted in the lawyers and observers in the body of the room being unable to see the commissioner much of the time. Similarly, it must have been difficult for the commissioner to see the lawyers and observers. Screens placed lower could be used. Teleprompter screens, as used for speeches, may also be useful in this application. These have an LCD display flat on the desk, with a transparent screen reflecting the image to the user. The room would be able to see the commissioner through the screen.

The witness had a screen in front of them, overcoming the problem which the ACT Coroner's Bushfire Inquiry had where the witness had to turn away from the screen displaying the evidence in order to answer questions from the Coroner.

Exhibits from the inquiry are available in electronic format via the inquiry's web site. These include printed paper documents, handwritten notes and drawings which have been scanned, as well as email messages. Much of the inquiry depended on this information, with witnesses being asked "Did you read this email or not?" and "You were provided with the web address were you not?".

Each exhibit appears to be in the form of a single PDF file. Documents scanned from paper have a sticker on them with a reference number and a barcode. Email messages, such as EXH_0305 AWB.5020.0262, have a reference number in the header and footer. Some email messages have been scanned from paper, as indicated by the bit-mapped text of handwritten notations, binder holes and skewed text. However, most are from a digital source, as indicated by character encoding (rather than a bit-mapped image). The PDF document properties indicate that Acrobat Distiller (6.0.1) was used to create the email files.

The email documents are inefficiently encoded. As an example, a 3 kbyte email message is stored as a 12 kbyte PDF document, with most of the space taken up by embedded fonts. No embedded fonts should be needed, as the text of the email messages could be displayed using the built-in Courier font in PDF.
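To illustrate the point (this is a sketch, not the inquiry's actual production process, and the sample email text is invented), a one-page PDF that displays text in the built-in Courier font, with no embedded fonts at all, can be written by hand in a few dozen lines of Python, and the result stays close to the size of the text itself:

```python
def text_to_pdf(lines):
    """Write a minimal one-page PDF showing `lines` in the built-in
    Courier font. No fonts are embedded, so the file stays small."""
    def esc(s):  # escape characters special in PDF string literals
        return s.replace("\\", r"\\").replace("(", r"\(").replace(")", r"\)")

    # Content stream: start text, select Courier 10pt, 12pt line spacing
    content = "BT /F1 10 Tf 50 800 Td 12 TL\n"
    for line in lines:
        content += "(%s) Tj T*\n" % esc(line)
    content += "ET"
    objects = [
        "<< /Type /Catalog /Pages 2 0 R >>",
        "<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 595 842] "
        "/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        "<< /Type /Font /Subtype /Type1 /BaseFont /Courier >>",
        "<< /Length %d >>\nstream\n%s\nendstream" % (len(content), content),
    ]
    out, offsets = "%PDF-1.4\n", []
    for i, obj in enumerate(objects, start=1):
        offsets.append(len(out))
        out += "%d 0 obj\n%s\nendobj\n" % (i, obj)
    xref_pos = len(out)  # cross-reference table records each object's offset
    out += "xref\n0 %d\n0000000000 65535 f \n" % (len(objects) + 1)
    for off in offsets:
        out += "%010d 00000 n \n" % off
    out += "trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % (
        len(objects) + 1, xref_pos)
    return out.encode("latin-1")

pdf = text_to_pdf(["From: someone@example.com", "Subject: Test",
                   "", "A short email body."])
```

The whole file comes out at well under a kilobyte, because Courier is one of the fourteen fonts every PDF viewer must supply itself.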

In some ways the storage of the email messages is too good. The PDF is hyperlinked with email addresses. Placing the cursor over an email address in a document and clicking will result in a mail message being created, addressed to that person. This could cause inconvenience and embarrassment to both the sender and the recipient, so it would be better if this feature was disabled.

While digital copies of mail messages are provided, they appear to have been edited. Only the From, Sent, To and Subject header fields of each message are provided. Other fields normally included with a message, particularly Message-ID, References and Received, are not provided. These could be useful in checking the authenticity of the messages and how complete the provided set of messages is. As an example, the References field could be used to help verify that one mail message is actually a reply to another.
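Had the full headers been kept, this sort of check could be done mechanically with Python's standard email library. As a sketch (the message IDs and subject lines below are made up), a claimed reply can be matched against the original by comparing the original's Message-ID with the reply's In-Reply-To and References headers:

```python
from email import message_from_string

def is_reply_to(reply_source, original_source):
    """Check whether one email message claims to be a reply to another,
    by matching the original's Message-ID against the reply's
    In-Reply-To and References headers."""
    reply = message_from_string(reply_source)
    original = message_from_string(original_source)
    original_id = original.get("Message-ID", "").strip()
    if not original_id:
        return False  # cannot verify without the original's Message-ID
    referenced = (reply.get("In-Reply-To", "") + " " +
                  reply.get("References", "")).split()
    return original_id in referenced

original = ("Message-ID: <100@example.com>\n"
            "Subject: wheat contract\n\nFirst message.")
reply = ("Message-ID: <101@example.com>\n"
         "In-Reply-To: <100@example.com>\n"
         "References: <100@example.com>\n"
         "Subject: Re: wheat contract\n\nReply.")
```

With the headers stripped, as in the exhibits, this kind of verification is impossible, which is the point being made above.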

The scanned documents are relatively efficiently encoded as black and white (two colour) bitmaps at 300 DPI. However, in some cases there are coloured covers on documents where the text is barely discernible.

Volume of Evidence

One issue which the inquiry brings out is the volume of material which someone may receive in a working day and how much of it they can reasonably read. At such an inquiry a witness may be asked if they had read an email message addressed to them. Given the volume of messages someone would receive in a day, it would be reasonable to say that you were not sure you had read the whole of a message, even if the system indicated you had opened it. But this sounds evasive in a court-like setting. Perhaps future email systems will record what parts of a message the user accessed, and for how long.

Role of ICT in Emergency Management

One topic for the inquiry will be how effectively ICT was used and how it could be used in the future. Of relevance may be:


Tuesday, January 13, 2009

Specification of an Enterprise Content Management System

The Australian War Memorial has issued a Request for Tender for ECM digital storage. Despite the storage-focused title, the tender provides a very good description of what is required for an Enterprise Content Management System:
A.1 Background

The Australian War Memorial (the Memorial) has recognised the increasing need to manage and deliver complex digital content within its collection as part of its mission to commemorate the sacrifice of those Australians who have died in war. The rapid increase in demands for digital storage and management capacity have, together with preservation and other management needs, resulted in a project to address Enterprise Content Management (ECM). The Memorial has previously concluded a tender process which selected Alphawest to implement an ECM solution based on the Interwoven suite of software products. This tender is for the supply of a digital storage solution (comprising storage hardware, associated storage management software, backup software, and implementation services) required to underpin the Digital Asset Management (MediaBin) component of the ECM implementation as outlined below.

A.2 ECM Conceptual Overview

The ECM solution to be implemented by Alphawest will address the following functional areas, and integrate across and between components:
  1. Electronic Document & Records Management System (EDRMS) – defined as the ability to manage corporate records and documents (including email) in compliance with standards for record management. This includes the ability to capture, classify and create document workflows and mechanisms to achieve collaborative work practices.
  2. Digital Asset Management (DAM) – defined as the ability to manage multiple format digital assets such as digital images, film & sound. This includes the ability to capture, classify and create workflows for such assets. This will be achieved using Interwoven's MediaBin software.
  3. Web Content Management (WCM) – defined as the ability to manage (create, approve, publish, amend, update and retire) content on the Memorial's website and intranet. ...

From: ECM digital storage, Request for Tender, Australian War Memorial, 12 January 2009


Tuesday, November 11, 2008

Recordkeeping for government web information

Archives New Zealand have issued a Request for Proposal for "Development of Web Information Continuity Guide". There is a four page description of the work, available for download from the NZ Government website.
Information produced and maintained on the web as part of public sector business is covered by the Public Records Act 2005. This includes information on public websites, intranets, shared workspaces, wikis, blogs and other types of sites, as well as information in the administrative systems used to run these sites.

Archives New Zealand is receiving increasing requests for advice on recordkeeping for web information. Current guidance contained in the Continuum Recordkeeping Resource Kit was largely developed in 2003 and needs to be updated and expanded to provide more useful support to public sector agencies on strategies and tactics for current web information management that will support the aims of the Public Records Act.

Archives New Zealand is looking for a contractor to undertake the project over the period to 30 June 2009:

Interested individuals or consultancies are invited to submit an expression of interest along with a proposal outlining how you would approach the work and details of relevant experience by Friday the 21st November 2008. ...

From: Development of Web Information Continuity Guide, Archives New Zealand, 21/11/08


Tuesday, October 21, 2008

Federal Court IT Guidelines Delayed until 2009

The Federal Court of Australia's "Guidelines for the Use of Information Technology in Litigation in Any Civil Matter" were due to be revised by 1 July 2008, but have now been delayed until 2009:
In 2007 the Federal Court commenced a comprehensive review of Practice Note No 17 with the assistance of a consultant, Ms Jo Sherman.

Following extensive consultations with litigants, legal practitioners and others, a draft Practice Note and related materials were finalised by Ms Sherman and referred to the Court's National Practice Committee in mid 2008.

These draft documents are now being reviewed by the Court in light of recent case management initiatives (including the legislative reforms in this area proposed by the federal Attorney-General) and further comments provided by litigants, legal practitioners and others with an interest in the use of technology in legal proceedings.

It is expected that a number of changes will be made to the documents, and that the final versions will be formally released in early 2009....

From: Review of Practice Note No 17 - Guidelines for the Use of Information Technology in Litigation in Any Civil Matter, Federal Court of Australia Practice News No. 59, October 2008


Wednesday, October 08, 2008

Linear Reading Doesn't Scale

The Dumbest Generation: How the Digital Age Stupefies Young Americans and Jeopardizes Our Future (Or, Don't Trust Anyone Under 30) by Mark Bauerlein

In "Screen no match for the page in education" (Higher Education Supplement, The Australian, 8 October 2008) Mark Bauerlein argues that reading on a computer screen encourages quick skimming and is therefore not suitable for the long, complex material needed for education. However, reading quickly is not a lesser form of reading: it is a difficult skill, needed to solve the problems of the 21st century.

In my courses on web design and e-document management at the ANU I cite some of the same sources Mr. Bauerlein uses, but reach the opposite conclusion. Jakob Nielsen and Donald A. Norman are sources of inspiration for designing computer based information systems, rather than reasons not to use such systems. Information presented on a screen needs to be designed differently to a printed page.

Carving on stone was the form of communication used for important messages of long-term value for much of written history. If Mr. Bauerlein's logic were followed, we should not be using books, as they were introduced for material designed to be read quickly and for short-term, disposable information. Typefaces such as Times New Roman evolved from those used for carving on stone, for quicker reading. About the only institution to continue to develop stone as a means of communication in the last thousand years is Parliament House Canberra, for which Garry Emery designed a new typeface suitable for carving in stone.

Taking the argument back further, universities should not use books, or printed material at all, as these make the students lazy. We should return to the approach to learning which applied for most of human history, where the teacher recited the text and the students memorised it.

To suggest banning the book in universities is taking an argument to ridiculous lengths, as is proposing to not make use of computers. Computers are a useful way to transmit some forms of information, just as books are for other forms of information. Computers have some strengths and limitations, as does ink on paper.

Universities and academics need to be able to work efficiently and react quickly, as well as think deeply. This is a necessary aspect of the world they are in. Computer systems offer some ways to make their processes more efficient. What they need to do is critically assess where computers can best be used.

As an example I set students assignments where they are required to carry out analysis of documents thousands of pages long. There is not time to read these documents, just as there is not in the real world the students are being prepared for. There is no option of taking years to read all the documents, as by the time you finished, there would be thousands more documents to read. The students need to learn to use computer based search tools to scan through the text to find relevant sections and then quickly form an assessment by skim reading what is important. They then need to concentrate their attention on the small important parts of the written work.
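As a small illustration of the kind of tool involved (a sketch, not one of the actual course tools, and the sample document text is invented), a keyword-in-context search that pulls out just the passages worth skim reading can be written in a few lines:

```python
import re

def keyword_in_context(text, term, width=40):
    """Return each occurrence of `term` with `width` characters of
    surrounding context, so a long document can be skimmed quickly."""
    hits = []
    for match in re.finditer(re.escape(term), text, re.IGNORECASE):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        # Collapse whitespace so each hit fits on one line
        hits.append("..." + " ".join(text[start:end].split()) + "...")
    return hits

# A stand-in for a report thousands of pages long
document = ("The warning system failed on the day. " * 50 +
            "Recommendation: adopt a common alerting protocol. " +
            "The warning system should use multiple modes. " * 50)
for line in keyword_in_context(document, "alerting protocol"):
    print(line)
```

The student's job is then the analytical one: judging which of the extracted passages matter and reading those closely.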

The web was invented for scholarly communication of science, and my colleagues are now inventing new technologies which will change the nature of scholarly communication further. Due to the backgrounds of those designing the tools, they will be biased towards scientific communication and may not suit other disciplines in the humanities. Some in the humanities have been engaged in a dialogue with the computer scientists as to what is needed for scholarly communication. If Mark Bauerlein has helpful suggestions, then he should make them. Otherwise he should not be surprised when the system implemented does not meet his requirements.

With more than a little irony, "Screen no match for the page in education" is available for reading online, as is Mr. Bauerlein's book, "The Dumbest Generation: How the Digital Age Stupefies Young Americans and Jeopardizes Our Future (Or, Don't Trust Anyone Under 30)".

A quick scan shows that The Dumbest Generation has 72 references to the Web and 29 to the Internet, but only one to Berners-Lee, the inventor of the web. The single reference to the origin of the web suggests it was "hacked together ... as a way for scientists to share research". But if you read the original proposal for the web ("Information Management: A Proposal", Tim Berners-Lee, CERN, March 1989, May 1990), rather than making the mistake Mr. Bauerlein did of just skimming the Wikipedia entry, you will see this was a carefully thought out proposal, not a hack.
