Thursday, April 15, 2010

Turning text into information

Alex Krumpholz is talking about how to apply insights from the way web search engines work to the analysis of scientific papers. Current web search engines were derived from previous work on text search systems, so it is interesting to see web search techniques now being applied back to text. One example is anchor text: search engines assume that the text highlighted in a link on a web page describes the document linked to. The equivalent for a research paper is the text near a citation. In essence, the algorithms are creating useful information from what is just text.
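To make the anchor text analogy concrete, here is a minimal Python sketch that collects the words around each citation marker in a paper, much as a search engine collects anchor text for a linked page. It is not the method from the talk: the numeric "[3]" citation style, the function name citation_contexts and the window size are all assumptions made for illustration, and the sample sentence is made up.

```python
import re
from collections import defaultdict

# Assumption: citations appear as bracketed numbers such as [3].
CITATION = re.compile(r"\[(\d+)\]")


def citation_contexts(text, window=8):
    """Map each cited reference number to the words found near its citations."""
    contexts = defaultdict(list)
    # Turn each marker into a single token so it survives word splitting.
    tokens = CITATION.sub(lambda m: f" __CITE{m.group(1)}__ ", text).split()
    for i, tok in enumerate(tokens):
        if tok.startswith("__CITE") and tok.endswith("__"):
            ref = int(tok[6:-2])
            nearby = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            # Drop other citation markers that fall inside the window.
            words = [w for w in nearby if not w.startswith("__CITE")]
            contexts[ref].append(" ".join(words))
    return dict(contexts)


# Made-up sample sentence, for illustration only.
sample = "Anchor text improves ranking [3] and link analysis helps too [7]."
for ref, snippets in citation_contexts(sample, window=3).items():
    print(ref, snippets)
```

The snippets gathered for a reference could then be indexed alongside the cited paper's own text, in the same way anchor text is indexed with the page it points to.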

Structural aspects of medical literature retrieval

Alex Krumpholz (SoCS CECS)

CS HDR MONITORING Info & Human Centred Computing Research Group

DATE: 2010-04-15
TIME: 13:30:00 - 14:00:00
LOCATION: Ian Ross Seminar Room

This work discusses the retrieval of medical publications in a clinical setting. It aims to help busy doctors find literature that is likely to be relevant to the current patient's case. IR related aspects of such a program are investigated.

Thursday, April 01, 2010

Mobile Web at CeBIT Sydney

I will be speaking on "Optimising Sites for Mobile Devices and Search Engines" at the WebForward Conference, 26 May 2010. Suggestions as to what to say and examples to use would be welcome. This is part of CeBIT Australia in Sydney. I may also attend the eGovernment Forum: Delivering open government on 25 May 2010.

Optimising Sites for Mobile Devices and Search Engines

by: Tom Worthington FACS HLM, Adjunct Senior Lecturer, Australian National University

Some simple tools and techniques can be used to allow any web site to be usable on a mobile phone and improve its rating with web search engines. These techniques can also help make your web site easier to understand for people using an ordinary web browser and particularly for use by older people.

  • Web Content Accessibility Guidelines
  • Validation Tests
  • W3C mobileOK Checker
  • TAW Web Accessibility Test

A set of tools and techniques has been developed to make the content of web pages more accessible. The best known of these are the W3C Web Content Accessibility Guidelines, with automated testing in the TAW Accessibility Tool. The use of such techniques is required by Australian anti-discrimination legislation, including by schools, TAFEs and universities in delivering education online. What is not well understood is that while these techniques are mandated for access by people with a disability, they can also be used to help with slow Internet connections, for limited devices such as mobile phones and for people with limited literacy.

There are additional guidelines and tools to help develop content for mobile devices, such as the W3C Mobile Web Best Practices. Previously this required development of a second version of the content, specifically for mobile devices. However, as smart phones become more affordable, with larger screens and better software, it is possible to author the same content for education and service delivery to both desktop and mobile devices.

Providing accessible content, and delivering it to mobile devices, requires the web designer and the content author to make difficult decisions. Compromises must be made over what can be delivered and in what form. This can help make better content and better learning, by eliminating material which is entertaining, but not educational.


Tom Worthington is an IT consultant and an Adjunct Senior Lecturer at the Australian National University, where he teaches the design of mobile web sites, e-commerce and green ICT. In 1999 he was elected a Fellow of the Australian Computer Society for his contribution to the development of public Internet policy.

Tom was an expert witness in the Human Rights Commission for the Sydney Olympics web case and was invited to Beijing to advise on the design of the web site for the 2008 Olympics. He is a past president, Fellow and Honorary Life Member of the Australian Computer Society, a voting member of the Association for Computing Machinery and a member of the Institute of Electrical and Electronics Engineers.

This stuff from my talk looks relevant. I would have to change the examples from government to business:

Run a business on your phone, or a war

* Technology once used by the Australian and US defence departments is now in your phone
* Use tools (W3C mobileOK Checker)

Exercise KANGAROO 95 took place in an area of over 4 million square kilometres, across the Top End of Australia, from July to the end of August 1995. It involved over 17,000 Australian Defence Force troops and visiting units from the USA, Malaysia, Singapore, Papua New Guinea, the UK and Indonesia.

Reports and photographs were transmitted from the exercise area using stand-alone portable satellite communications terminals, capable of 64kbps.

As manager of the Defence home page, I received the reports at Defence headquarters in Canberra and uploaded them to a publicly accessible Internet server at the Australian Defence Force Academy.

For the first week of the exercise I was officially on holiday, but maintained the K95 home page remotely using a pocket 2400 bps modem and laptop PC from Mallacoota, Victoria.

Later exercises, such as Tandem Thrust 97, made more use of the Internet for operations. However, the level of technology used in these exercises is similar to that now available in a 3G smart phone for about $1,000.

E-mail, the web and instant messaging can be used for business and government from a smart phone with a little forethought. Check your business web pages are mobile compatible. Make sure you put the important business information in a simple, clear, easy to read format. Don't use software and technology you don't really need.

Web 2.0 Thinking Needed

* Early simple web pages are compatible with mobile phones
* Later web design lost mobile compatibility
* New CSS features allow for desktop/mobile compatibility
* Need web 2.0 thinking by organisations

The processing power and network bandwidth of the laptop and desktop computers used for exercise KANGAROO 95 in 1995 are comparable with what is available from 3G smart phones in 2009. However, the use of these devices is being held up by poor web design.

Web page designs in 1995 were text rich, with small images and limited layouts. These designs are compatible with today's smart phones. Later web pages lost this compatibility, due to their more complex designs and increased file sizes.

Also, web based services need to be designed as services, not as marketing brochures. CSS features supported in modern web browsers allow for web 2.0 features, but organisations need to accept that their staff and their clients will want to be involved in decision making. Effort therefore needs to be put into clear, detailed, information rich web sites.

Mobile Thinking

* What not to do: Big hard to read e-documents. (Consulting with Government online)
* What to do: accessible and mobile friendly (Online Consultation Guidelines)
* Don't intimidate with legalese: Government copyright notice
* Use open access: the Commonwealth copyright notice is outdated, use Creative Commons
* Example: Public Sphere #2 – Government 2.0: Policy and Practice

Thinking about interaction via a mobile device can help you think about how and what to communicate. The limitations in screen size and keyboard access force you to focus on the most important information first.

Some impediments to the use of the technology can be easily removed. As an example, in 2008 the Department of Finance and Deregulation issued a report on "Consulting with Government – online". This was a well reasoned exploration of the issues. However, the report was only issued as large, hard to read online PDF and RTF files.

The Online Consultation Guidelines from the Australian Government Information Management Office (AGIMO) are reasonably accessible and mobile friendly. The home page achieves an 80/100 score on the W3C mobileOK Checker. However, discussion of the document is still hampered by a Commonwealth Copyright Notice.

The Commonwealth Copyright notice used for web pages is essentially unchanged from the Department of Defence web copyright notice developed in 1995. The Australian Government should adopt the Australian Creative Commons Licence, or similar, to allow the free discussion of issues. Similarly, companies should check the licences they impose on information distributed. If you want your product details out there being discussed, don't make it hard.

Thursday, March 04, 2010

Information Retrieval for Real-world Tasks

Paul Thomas (CSIRO and ANU) presented a seminar on "Information Retrieval for Real-world Tasks" at the ANU CSIT Seminar Room, N101 today. He argued that web search engines are historically related to document search systems which were sponsored by the US DARPA (TREC). The original task typically was to find a ranked list of documents relevant to a question, such as ones on smuggling plutonium out of the Soviet Union. There was an unintended pun in this, as Paul talked about these being "atomic" documents. He argued that returning a list of documents does not suit real world tasks, such as choosing an espresso machine to buy. I was not convinced by the examples he gave, which showed Google products listing web pages about espresso machines. The Google products search returns a list of espresso machines, with the assumption that the first in the list is best (Paul missed another pun here by not bringing up the details of the Atomic Coffee Machine). He then changed tack to show examples of searches for biomedical data, which identified specific items in documents.

It seemed to me that there were two distinct topics Paul was confusing: information retrieval and task support. Information retrieval can be used to support some task, such as selecting a coffee machine. But retrieving information about coffee machines is not the same as purchasing a coffee machine. Real world search engines, such as Google, use heuristics to short cut this process. If people searching for coffee machines are really looking to buy one, then the search is modified to answer the question the user meant to ask, not what they actually asked. This process has proved lucrative for Google, as it results in people buying products and Google being paid for helping with that process.

Returning to the original example Paul used, of identifying plutonium smuggling, the real task is to detect and stop it, not just find documents. What the user of Web2 War systems, such as US Army Knowledge Online (AKO), US intelligence Intellipedia and the Tactical Ground Reporting System (TIGR), would ideally be directed to are not just historical documents, but live systems such as General Dynamics Mediaware's JPEG2000 for Wide Area Airborne Surveillance, with data from Predator UAVs. The system could then offer to issue the relevant tasking order to produce a kinetic response, in real time.

ps: One of the side tracks this seminar took was the origin of the 10 documents goal of the TREC information retrieval tasks. One theory was this was as many as could be displayed on an old green screen. My theory was that if more than that many were displayed, the user would have to take their socks off to count them. ;-)


Monday, February 02, 2009

Google Detecting influenza epidemics

Staff of Google, in collaboration with the Centers for Disease Control and Prevention, have published a letter in the prestigious scientific journal Nature on "Detecting influenza epidemics using search engine query data". The idea is that people with the flu will do web searches about it, thus alerting authorities to an outbreak. This is a clever idea, but not the one I had in mind when I proposed using the web for combating an avian influenza epidemic.

Seasonal influenza epidemics are a major public health concern, causing tens of millions of respiratory illnesses and 250,000 to 500,000 deaths worldwide each year [1]. In addition to seasonal influenza, a new strain of influenza virus against which no previous immunity exists and that demonstrates human-to-human transmission could result in a pandemic with millions of fatalities [2]. Early detection of disease activity, when followed by a rapid response, can reduce the impact of both seasonal and pandemic influenza [3, 4]. One way to improve early detection is to monitor health-seeking behaviour in the form of queries to online search engines, which are submitted by millions of users around the world each day. Here we present a method of analysing large numbers of Google search queries to track influenza-like illness in a population. Because the relative frequency of certain queries is highly correlated with the percentage of physician visits in which a patient presents with influenza-like symptoms, we can accurately estimate the current level of weekly influenza activity in each region of the United States, with a reporting lag of about one day. This approach may make it possible to use search queries to detect influenza epidemics in areas with a large population of web search users. ...

From: Detecting influenza epidemics using search engine query data, Jeremy Ginsberg, Matthew H. Mohebbi, Rajan S. Patel, Lynnette Brammer, Mark S. Smolinski & Larry Brilliant, Nature, doi:10.1038/nature07634; Received 14 August 2008; Accepted 13 November 2008; Published online 19 November 2008

Wednesday, November 12, 2008

Detect Influenza outbreaks with web searches

[Image: Graph of five years of flu estimates for US Mid-Atlantic region compared with CDC data]

Google have created a service to "Explore flu trends across the U.S.". The system tracks the use of search terms which indicate that people have influenza and plots this on a graph over time and a map of the USA. According to "Google Uses Searches to Track Flu's Spread" (by Miguel Helft, The New York Times, November 11, 2008), a paper on this will be published in Nature. The idea of using web searches to detect natural phenomena is not a new one, with previous proposals to use internet traffic to detect earthquakes. The technique might be used as part of an ICT system to deal with an Avian Influenza Pandemic.

Each week, millions of users around the world search for online health information. As you might expect, there are more flu-related searches during flu season, more allergy-related searches during allergy season, and more sunburn-related searches during the summer. You can explore all of these phenomena using Google Trends. But can search query trends provide an accurate, reliable model of real-world phenomena?

We have found a close relationship between how many people search for flu-related topics and how many people actually have flu symptoms. Of course, not every person who searches for "flu" is actually sick, but a pattern emerges when all the flu-related search queries from each state and region are added together. We compared our query counts with data from a surveillance system managed by the U.S. Centers for Disease Control and Prevention (CDC) and discovered that some search queries tend to be popular exactly when flu season is happening. By counting how often we see these search queries, we can estimate how much flu is circulating in various regions of the United States. ...

From: How does this work?, Google Flu Trends, Google, 2008
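The comparison Google describes, between query counts and CDC surveillance data, can be sketched in a few lines of Python. This is only an illustration of measuring how closely two weekly series move together, using an ordinary Pearson correlation. It is not Google's actual model, and the weekly figures below are made-up numbers, not real query or CDC data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of the same length, with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly values for one region (illustrative only):
# percentage of all searches that are flu-related, and percentage of
# physician visits with influenza-like illness.
query_share = [0.8, 1.1, 1.9, 3.2, 2.7, 1.4]
ili_visits = [1.0, 1.3, 2.2, 3.9, 3.1, 1.6]

r = pearson(query_share, ili_visits)
print(f"correlation between query share and physician visits: {r:.3f}")
```

A value close to 1 would suggest the query series is a usable stand-in for the slower surveillance data. A real system would also need to choose which queries to count and to check the fit on weeks it was not tuned on.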

Wednesday, October 15, 2008

Designing Australian English

A Workshop on Designing the Australian National Corpus is being held in Sydney 4-5 December 2008. The aim is to create a collection of Australian English text, including transcripts of spoken English, similar to the American National Corpus. This data is then used for researching the use of language. However, I have my doubts as to how realistic a "national" corpus is and how well the manual processes currently used to create such collections work. The automated tools used with web search engines would seem to be able to collect far larger volumes of material without the arbitrary "national" label.

As part of SummerFest 2008, to be held at UNSW in the week of December 1st—5th 2008, HCSNet (the ARC Research Network in Human Communication) is organising a workshop on Designing the Australian National Corpus

This workshop focuses on current developments and emerging possibilities in corpus construction and usage for researchers working in Human Communication Sciences. Its aim is to bring together researchers with expertise in data representation and corpus building, as well as corpus annotation and interrogation, in a single forum in order (1) to disseminate leading work on corpus construction and usage to the broader research community in Australia and thereby contribute to collective knowledge about data collection and representation, and (2) to work towards the design and construction of an Australian National Corpus that is innovative in exploiting the full potential of the interface between language and technologies. ...


Submissions for presentation at the workshop are sought. Topics of interest include but are not limited to:

  • corpus linguistics
  • corpus data
  • web-based corpora
  • linguistic and multimodal data representation
  • audio(visual) text transcription
  • language documentation
  • corpus interrogation
  • corpus annotation
  • corpus design and construction
  • language data ethics
  • corpus-based research

Friday, October 03, 2008

Fake blogs make Blog search risky

IT World reported a comparison of rival Blog search engines ("Is Google Blog Search a Techmeme killer? No way.", by Ian Lamont, October 2, 2008), so I did some ego surfing to see who said what about me. But the search resulted in so many scam blogs, it makes blog searching a risky business and not very useful.

A search for "Tom Worthington", taking out the references to my own site and other well known people of the same name (in the USA there is an attorney and a fish seller who frequently feature in news web sites), left only 122 references. Some of these were by me, others were just relays of postings from my own blog, but some were thoughtful, if not always positive, comments on my work. Some are from people I know, but most are from people I don't. Even where I know the author, I was not aware of the postings.

One worrying aspect is that about one quarter of the postings seem to be pieces of random text copied from web pages to produce fake blogs, mostly on These are then used to lure people to web sites packed with dubious advertising, re-directions and pop ups. One which seems popular with scams is Jim Byrne's summary of the web discrimination case "Bruce Maguire versus Sydney Organising Committee for the Olympic Games (SOCOG)", in which I get a mention. It is not clear why this would be used to promote sex web sites, but perhaps the document is very popular and so useful to attract web traffic.

The blog search engine designers need to improve their algorithms to reduce the risk of recommending fake blogs. The problem does not seem to occur with normal web searches, so a solution should not be too difficult. The blog hosting sites, particularly, need to put in tests for such sites. This is a serious problem: a searcher is so likely to end up at a dubious web site that it is not worth using a blog search at all until it is fixed.
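One test a hosting site or search engine might apply is to check how much of a posting is copied word for word from an existing web page. The Python sketch below does this with overlapping word triples ("shingles"). It is only an assumption about what such a test could look like, not a description of what any blog search engine does, and the function names and sample texts are hypothetical.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def copied_fraction(post, source, k=3):
    """Fraction of the post's k-word shingles that also occur in the source."""
    post_shingles = shingles(post, k)
    if not post_shingles:
        return 0.0
    return len(post_shingles & shingles(source, k)) / len(post_shingles)


# Made-up texts, for illustration only.
source_page = "the quick brown fox jumps over the lazy dog"
scraped_post = "quick brown fox jumps over"
original_post = "a thoughtful comment on the lecture"

print(copied_fraction(scraped_post, source_page))
print(copied_fraction(original_post, source_page))
```

A posting whose text is almost entirely lifted from another page would score close to 1 and could be ranked lower or held for review, while ordinary quotation inside an original posting would score much lower.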

Tuesday, July 08, 2008

How to talk your way into an organization

Perhaps I need to put on a grim face, as people keep asking me for advice. I was sitting in the ANU science library and was asked if I knew how to contact ANU biology people about working on "interleukin". I had no idea what this was, but suggested the same approach I use to contact people to visit when traveling. This is to use a search engine, such as Google, to look for material written on the topic by, or about, people at the institution. Then you can compose a message to them, citing their work, expressing your interest and what you can contribute. This works a lot better than calling the switchboard at random and trying to introduce yourself. This approach works with companies and government agencies as well as for universities and research organizations.

Often the person you write to will not be the right one, or no longer be at the institution, but will pass your message on. You may have to try several people in the organization, but don't email everyone at once, or you will be seen as a crank. Students from South East Asian universities, for example, tend to bombard me with requests for summer visits and PhD work. The problem is that when I find all my colleagues got the same generic request it goes from being flattering to an annoyance. But those requests which mention something I have written and explain how they would like to work on it are effective.

If you have the time, another way to contact people is to have them find you. If you write about what they have done and put it on the web, they, or a colleague, will likely find it. You need to make what you write sound like more than a puff piece or a request for a job.

Also remember that you need to have a clear and realistic plan of what you intend to do when you have made contact. If you are an undergraduate student, then it is unlikely that the head of a world leading research institute is going to listen to some bright idea you just dreamed up. But one of their assistants might suggest a program they have you could participate in.

Also remember the human search engine, which is the administrative assistant to the head of the organization. If you can make contact with this person they can help suggest who to talk to and, more importantly, "suggest" (tell) staff to talk to you. I discovered this approach when trying to find people to visit in computing at Cambridge University. As a bonus I got lunch at high table (it did help I was the president of my national professional body at the time).

A variation on the ask someone approach is to use a social networking tool. Before a recent visit to Malaysia, Greece and Turkey, I used Linked-In to search for people interested in my topics at those places. The Linked-In system then identified people who knew people who knew me. The system will then send a request for an introduction through the chain of acquaintances, to put you in contact. This worked well and the intermediaries were very happy to facilitate, this also being a way to renew old acquaintances and make new ones.

Monday, March 24, 2008

Lectures on-line

Looking for something from a previous lecture I had given, I did a web search on the course code "COMP2410" and was surprised to find the audio of the lecture listed on FilesTube. I was worried they had taken an unauthorized copy of my lecture, but it turns out that it is a search engine which specializes in file sharing sites. It found the file on the university web site and put in a link to it. This was a lecture for which I was experimenting with providing synchronized audio and slides.

The original notes for "The Web on Small Screen" are online. The idea was then to synchronize this using SMIL. Unfortunately there were not enough SMIL players to make this worth doing.

Tuesday, April 17, 2007

Information Retrieval at Microsoft Research Labs

Recommended: Nick Craswell is speaking on Information Retrieval (Web IR) in Canberra Wednesday (I visited Nick at Microsoft Research Labs Cambridge, on a bicycle tour of Europe):
The Australian National University

Challenges in Web Information Retrieval (Web IR)
Nick Craswell (Microsoft Research Labs, Cambridge, UK)

DATE: 2007-04-18
TIME: 16:00:00 - 17:00:00
LOCATION: CSIRO Seminar Room S206 (come to reception on Level 2, CS&IT Bldg)

ABSTRACT: When building a Web search engine, we can benefit from core IR techniques, such as probabilistic ranking models and evaluation methods. But we also face problems that are not yet so well-studied in the field of IR. This talk explores several of these. For efficiency reasons, we need to crawl the web selectively. This raises an interesting query-independent ranking problem. We have large-scale logs of user behavior. I will present a novel approach for dealing with sparsity of this data. We may also have relevance judgments for a large number of queries, as in the new TREC "million query" track, which allows for large-scale parameter tuning experiments. Each of these problems lends itself to data-driven solutions. The talk should thus give a flavour of the work that goes on in the area of commercial Web IR.

BIO: Nick is a PhD graduate from ANU Computer Science who worked in CSIRO's Enterprise Search group before joining Microsoft Research in Cambridge. Nick is now employed as a researcher in the team behind Microsoft's search engine. He is a coordinator of the TREC Enterprise Track and the INEX Entities Track. He is also a Senior Reviewer for the ACM SIGIR Conference and is author of many influential and highly cited papers in the Information Retrieval area.

Friday, March 30, 2007

Google Came to Canberra

On Thursday, Will Blott and Alan Noble from Google's Sydney office, and Neetu Sabharwal, visited the ANU in Canberra:
"Google Australia is looking to forge relationships with key universities as they now have a dedicated 'on campus' focus in Australia. Google is keen to explore opportunities to partner that will add value to students' experience and help develop computer science engineers for Australia."
The overall message from the visit is that Google is looking for staff who can write useful computer programs. They are happy to provide support to researchers, to offer students the opportunity to work with Google people, but in the end they want people who can write useful computer programs, not just research papers. This was a refreshingly down to earth view.

One aspect I found interesting was Google's global nature. The company has a US West Coast base. This results in some slightly annoying cultural aspects of their promotional material making them a bit like a cross between the McDonalds hamburger chain and The Wiggles. But Google is developing labs around the world which are growing rapidly. While the staff are physically located in one lab, they work with those in others.

National research offices for global corporations can have their problems. When I visited Microsoft Research Labs in Cambridge (UK), there seemed to be a fear that they would be out-researched by low cost PhDs at Microsoft Beijing. Google use their company culture to attempt to overcome this.

One interesting aspect of having a Google center in Australia is that students from the Asian region at Australian universities might have better access to Google scholarships and jobs than they would at home. There is a much smaller pool of students in Australia to compete for attention than at an Indian or Chinese university. Once in the Google door, they then have access to the Google center in their home country.

Google Work With the ANU

Before Will and Alan gave a seminar, there was a discussion of possible areas for cooperation. Three areas I thought worth looking at were:

* Digital Mapping for the Public Good: Mobile phones for bushfire mapping, and applications for a GPS open source smart phone.

Bushfire mapping

[Image: Sentinel Interactive Fire Tracking Map Demonstration]

One student evaluated what was needed for an emergency management web site.

One application is adaption of the Sentinel Fire Mapping System for mobile devices. An experimental alternative web interface is available.

* Broadband Applications for Non-Broadband Users: New web applications are tending to require more and continuous network access. This makes it more difficult for those still on slow dial up connections and for wireless users with slow intermittent connections. These could be people in developing nations, such as India and China, but also in regional parts of places like Australia. These might not sound like high value customers for a company to target, but many of the same techniques used to provide Internet applications to rich people with smart phones can also be used for slow dialup users.

[Image: Sahana home page on a mobile phone]

An example is to modify the Sahana open source disaster management system for a phone.

* Cultural Links: As I found when teaching web design to museum workers in Samoa, there is great interest and value in providing web access to cultural material. But this tends to result in relatively dull, academic web sites, separate from the lively commercial stuff. Creating lively web sites is hard work. It should be possible to enhance the culturally worthy stuff, using some automated techniques like those applied commercially.
Ten Canoes Study Guides
Two students undertook projects to provide a better web interface to Australian museum materials, including those which inspired the movie Ten Canoes.

One student is now working out how to use this to provide more relevant links from the ACS Digital Library to services such as Google.

Google Apps

There was a little of a sales pitch in the visit, with Google saying how good their Google Apps Education Edition is. I am not sure how many universities, or companies, would be convinced of this. While organizations may be willing to use free third party systems to allow people to interact remotely, they are reluctant to have these systems as part of their "mission critical" applications. They are even more reluctant to have their data stored on someone else's system at an indeterminate location in some other country under that country's laws.

A lot of this reluctance to use external providers is irrational. Shared and remote systems used to be an everyday part of computing. Google's system is likely to be more reliable than the average corporate system and there are benefits in having your data stored away from head office. In a recent case a hail storm closed several buildings in Canberra for days. The ANU campus was closed, but the computer systems kept working and people were able to work remotely. With something like Google Apps an organization would be able to keep working remotely (perhaps even via smart phones).

However, I have to admit that while I use Google's Blogger service to prepare my blog, I still get it to put the files on my own web server located in Australia. I like the comfort of my data on a system I am paying for in a location under the same laws. Google will be hampered in promoting Google Apps in Australia, as their data centers are located in other countries, and so mostly not subject to Australian law.

Google would have difficulty locating a data center in Australia, as there are limited international telecommunications links to Asia and the USA. Perhaps the ALP could dip into the Future Fund some more to pay for extra fibre optic links to the USA and Asia. Given the amount of traffic coming from Google, this may have a significant impact on Australian telecommunications.

Thursday, March 22, 2007

Google Comes to Canberra

Next Thursday Will Blott and Alan Noble from Google's Sydney office are visiting the ANU in Canberra.

The first part of the visit sounds like a sales pitch: "Google Australia is looking to forge relationships with key universities as they now have a dedicated 'on campus' focus in Australia. Google is keen to explore opportunities to partner that will add value to students' experience and help develop computer science engineers for Australia."

The second part is a technical presentation on the development being done for Google in Sydney, including Google Maps.

While I have been aware of some involvement of search engine developers locally, it will be interesting to put faces to names. The Stanford University lab where Google originated uses my web site to test new search technology. At one stage I had to tell them to slow down the crawling of my site. Some people from ANU have gone to work at Google and Microsoft on search technology.

Relevant projects at ANU include ones on semantic web for cultural publishing, mobile phones for bushfire mapping, and applications for a GPS open source smart phone.

Semantic Web for Cultural Publishing

[Image: Ten Canoes Study Guides]

Two students undertook projects to provide a better web interface to Australian museum materials, including those which inspired the movie Ten Canoes.

One student is now working out how to use this to provide more relevant links from the ACS Digital Library to services such as Google.

Bushfire mapping

[Image: Sentinel Interactive Fire Tracking Map Demonstration]

One student evaluated what was needed for an emergency management web site.

One application is adaption of the Sentinel Fire Mapping System for mobile devices. An experimental alternative web interface is available.

[Image: Sahana home page on a mobile phone]

Another application is to modify the Sahana open source disaster management system for a phone.

Tuesday, February 27, 2007

Secure Web Searching in Organisations

Recommended. CSIRO have spun off their search technology in the Funnelback product and ANU IT students working on searching with CSIRO have gone on to work for Google and Microsoft:


Secure Search inside the Enterprise

Peter Bailey, (The ICT Centre, CSIRO)

DATE: 2007-02-28
TIME: 16:00:00 - 17:00:00
LOCATION: CSIT Seminar Room, N101

Providing secure search in the presence of document level security (DLS) is so easy in theory that no one has written papers about it until now. In practice, it turns out that while implementing it is (mostly) easy, user expectations of the search experience get in the way of getting it right. A model of the main factors in secure search implementations is presented, together with an analysis of search performance in an experimental DLS environment. Various conclusions are drawn from the results and about the tradeoffs which can be made to optimise for the user's search experience. Note that we do not attempt to describe how a DLS system itself should be implemented - the search system typically must use whatever underlying security mechanisms exist.

Dr Peter Bailey is the leader of the Search and Delivery Project in the CSIRO ICT Centre
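
To show what searching under document level security involves, here is a minimal Python sketch which ranks documents first and then drops any the user is not permitted to see. It is not the model from the talk: the group-based permission check, the toy term-count ranking, and every class and function name are assumptions for illustration. As the abstract notes, a real search system must use whatever security mechanism the underlying system provides.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    # Hypothetical access model: groups allowed to read the document.
    allowed_groups: set = field(default_factory=set)


def rank(query, documents):
    """Toy ranking: order documents by how often they contain the query terms."""
    terms = query.lower().split()
    scored = []
    for doc in documents:
        words = doc.text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


def secure_search(query, documents, user_groups, limit=10):
    """Return at most `limit` ranked results that the user may read."""
    visible = []
    for doc in rank(query, documents):
        # Security check applied after ranking, one document at a time.
        if doc.allowed_groups & user_groups:
            visible.append(doc)
            if len(visible) == limit:
                break
    return visible


# Made-up documents, for illustration only.
docs = [
    Document("d1", "budget report 2007", {"finance"}),
    Document("d2", "budget budget forecast", {"executive"}),
    Document("d3", "staff picnic", {"all"}),
]
print([d.doc_id for d in secure_search("budget", docs, {"finance"})])
```

Checking permissions after ranking is only one possible design. Where the check happens affects both the cost of a search and what the user sees, which appears to be the kind of tradeoff the abstract refers to.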
