Metadata and E-document Management - Advertising on the Web


This is a brief overview of how one approach to web advertising uses metadata. The approach allows small web content providers to earn revenue from advertising on their web sites.

These are notes for a case study on e-commerce and electronic publishing for the ANU course "Information Technology in Electronic Commerce" (COMP3410/COMP6341). This section of the course is prepared and presented by Tom Worthington FACS HLM, a Visiting Fellow in the Department of Computer Science at the Australian National University (and Director of Tomw Communications Pty Ltd).


One of the common ways to support the cost of publishing is through advertising. While the web lowers the cost of publishing and allows even small organizations to become e-publishers, the cost of supporting advertising may be prohibitive. One way around this is to aggregate the advertising through a broker, as Google does with its AdWords and AdSense services.

The advertising broker accepts advertising and then places it with suitable web sites. The broker is paid by the advertiser and then shares some of the payment with the web site publisher. This process can be automated: the advertiser describes the type of web sites they would like to advertise on, and this description is matched with suitable sites (using metadata about the geographic region and suitable keywords). The price of the advertisement can be determined by auction, with the price driven by the popularity of the keywords. In effect, the metadata about the web pages and the advertisements become commodities, and proxies for the advertisements and web page contents. Google uses this process, and its services are used for this example.

Matching advertisers to publishers

Advertisers select the region they want to advertise in and the language. They then supply a headline, a brief text description (up to 95 characters in total) and the URI of the advertiser's web site. The aim of the advertiser is to have readers click on the URI and be forwarded to the advertiser's web site.


The advertiser specifies keywords (or phrases). These are matched against keywords found on web pages to decide where the advertisement is placed. The assumption is that readers who are interested in web pages featuring those keywords will be interested in the advertisements featuring them.

Cost per "click"

Unlike print advertising, the advertiser is charged not for the advertisement appearing, but for the reader "clicking" on the link in the advertisement and (presumably) having their web browser display the advertiser's web page.

Inserting Advertisements on Web Pages

Unlike print advertising, the web site provider (publisher) does not directly select the individual advertisements for display. The words in their web site are analyzed by the broker's system to determine the keywords for the page. These keywords are then matched with the advertisers' keywords, along with region and language, to select the advertisements.
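The matching step described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how any real broker works: keywords are taken to be the most frequent words on the page, phrases are not handled, and the names Ad, page_keywords, match_ads and STOP_WORDS are hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical stop-word list; a real system would use a far larger one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "by", "in", "on", "for", "is"}


@dataclass
class Ad:
    headline: str
    keywords: set   # lower-case single words chosen by the advertiser
    region: str     # region the advertiser selected
    language: str   # language the advertiser selected


def page_keywords(text, top_n=5):
    """Return the top_n most frequent words on the page, ignoring stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return {word for word, _ in counts.most_common(top_n)}


def match_ads(text, ads, region, language):
    """Select ads for this region and language that share a keyword with the page."""
    keywords = page_keywords(text)
    return [
        ad for ad in ads
        if ad.region == region
        and ad.language == language
        and ad.keywords & keywords
    ]
```

For a page about the Indian Pacific train, an advertisement with the keyword "pacific" targeted at the reader's region and language would be selected, while one with unrelated keywords, or one targeted at another region, would not.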

The publisher does control which web pages carry advertisements and where on the page they appear. This is done by inserting a small segment of code (such as JavaScript) into the web page's HTML at the point where the advertisement is to appear. The code requests the advertisements from the broker's system.

The publisher is not paid for the advertisements appearing, but for those which have been clicked on. This gives the publisher an incentive to place the advertisements prominently on pages featuring popular keywords.
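The flow of money under per-click charging can be illustrated with a small calculation. The function name settle_clicks, the cost per click and the publisher's share are all hypothetical; the notes do not state what price or share a real broker uses.

```python
def settle_clicks(clicks, cost_per_click_cents, publisher_share_percent):
    """Split the advertiser's click charges between publisher and broker.

    All amounts are whole cents, which avoids floating-point rounding.
    The share and the price are illustrative assumptions only.
    """
    if clicks < 0 or cost_per_click_cents < 0:
        raise ValueError("clicks and cost per click must be non-negative")
    if not 0 <= publisher_share_percent <= 100:
        raise ValueError("publisher share must be between 0 and 100")
    # The advertiser pays only for clicks, not for the number of times shown.
    advertiser_charge = clicks * cost_per_click_cents
    publisher_payment = advertiser_charge * publisher_share_percent // 100
    broker_revenue = advertiser_charge - publisher_payment
    return advertiser_charge, publisher_payment, broker_revenue
```

With an assumed 20 cents per click and a 50% share, settle_clicks(85, 20, 50) charges the advertiser 1700 cents and pays the publisher 850 cents. With no clicks, nothing is charged or paid, however often the advertisement was displayed.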

Example: Indian Pacific

As an example, the author recently added two advertisement blocks to the web travelogue "Indian Pacific: Sydney to Perth by Train". For the larger advertisement on the top right of the page, the broker's system inserted advertisements for "The Indian Pacific Train", "Perth save up to 65%", "Indian Pacific" and "Blue Mountains Web". Clearly these are being matched on the phrases "Indian Pacific" and "Blue Mountains".

Screen shot showing a large advertisement, from Indian Pacific: Sydney to Perth by Train, Tom Worthington, 1995, URL:

For the small block at the bottom, the advertisements were for "Indian Pacific", "Mudgee New South Wales" and "Bus Tour Blue Mountains". It should be noted that the words "Mudgee" and "Bus Tour" do not occur in the web page and have been provided by the advertiser.

The larger block at the top of the page achieved a click-through rate of 8.5%. That is, for every 1000 people who viewed the web page, 85 clicked on one of the links. The smaller block at the bottom of the page achieved only 2.8% in the same period.
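The click-through arithmetic can be checked with a short sketch. The function names click_through_rate and expected_clicks are hypothetical, chosen only for this illustration.

```python
def click_through_rate(clicks, views):
    """Return clicks as a percentage of page views."""
    if views <= 0:
        raise ValueError("views must be positive")
    return 100 * clicks / views


def expected_clicks(rate_percent, views):
    """Return the number of clicks implied by a click-through rate."""
    return rate_percent * views / 100
```

Here click_through_rate(85, 1000) gives the 8.5% quoted for the larger block, and expected_clicks(2.8, 1000) gives about 28 clicks per 1000 views for the smaller block.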

Semantic Web: Making sense and cents of the web

Matching advertisements to web pages and, indirectly, to the interests of the reader is an exercise in semantics. The broker's system tries to distill the essence of the meaning of a web page into some keywords. The advertising system, as an e-commerce process, depends on the correctness of the matching. Readers will not respond to advertisements unrelated to their interests, and advertisers do not pay the broker when there are no clicks. As web pages gain more machine-readable meaning, via approaches such as the Semantic Web, this process can be refined and further automated.

The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners. It is based on the Resource Description Framework (RDF), which integrates a variety of applications using XML for syntax and URIs for naming.

"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001

From: Semantic Web, W3C, 2005, URL:

Small web publishers, such as the author, are likely to be entering content and laying it out manually. The advertising broker needs to provide a web-based interface for the advertisers and the publishers to create their advertisements and place them. Large web sites may have data entered and positioned automatically, and may be dynamically created from a database. Such sites require the creation and placement of advertisements to be more automated, and can use Web Services for this.

Resource Description Framework

One example of the Semantic Web in use is the Creative Commons Licence. The licence markup contains Dublin Core metadata expressed in the Resource Description Framework (RDF), specifying the creator of the page and what can be done with it:

<rdf:RDF xmlns="" ...
<dc:title>Coming of Information Age in Samoa ...
<dc:date>2005 ...
<License rdf:about="" ...
<permits rdf:resource="" ...

Code showing Creative Commons Licence, from Coming of Information Age in Samoa: A web workshop for Pacific museums staff, Tom Worthington, 2005, URL:
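As a rough sketch of how software might read such metadata, the Python below parses a small RDF/XML document with the standard library. The sample document is a made-up stand-in and is not the markup from the page cited above: the title, the licence URI and the permitted uses are assumptions for illustration, as are the names SAMPLE_RDF and read_licence_metadata.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the values are illustrative, not taken from the cited page.
SAMPLE_RDF = """<rdf:RDF xmlns="http://web.resource.org/cc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="">
    <dc:title>Example Page</dc:title>
    <dc:date>2005</dc:date>
    <license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
  </Work>
  <License rdf:about="http://creativecommons.org/licenses/by/2.0/">
    <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
    <permits rdf:resource="http://web.resource.org/cc/Distribution" />
  </License>
</rdf:RDF>"""

# Prefix-to-namespace map used in the search paths below.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "cc": "http://web.resource.org/cc/",
}


def read_licence_metadata(rdf_text):
    """Extract the title, date and permitted uses from RDF/XML text."""
    root = ET.fromstring(rdf_text)
    # Attributes in a namespace are addressed as {namespace}name.
    resource_attr = "{%s}resource" % NS["rdf"]
    return {
        "title": root.findtext(".//dc:title", namespaces=NS),
        "date": root.findtext(".//dc:date", namespaces=NS),
        "permits": [el.get(resource_attr) for el in root.iterfind(".//cc:permits", NS)],
    }
```

Because the meaning of each element is fixed by its namespace, a program such as a broker's page analyzer could read the title and licence terms directly rather than inferring them from the text of the page.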