Monday, January 04, 2010

Phone number put Hex on mail message

Recently SpamAssassin warned me that an incoming message might be Spam. It turned out this was because the message mentioned a number of transportation web sites. Sydney Public Transport Information uses a phone number as its web address. This triggers rule URI_HEX ("URI hostname has long hexadecimal sequence"). Sydney Ferries is an .info domain name which triggers INFO_TLD ("Contains an URL in the INFO top-level domain"). It is not clear to me why these should be suspicious.
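One likely reason the phone number matched is that a string of decimal digits is also a valid hexadecimal string. The following is a rough sketch of that kind of check, not SpamAssassin's actual rule definitions: the six-character threshold, the function name and the example hostnames are all assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit

# Assumed pattern: a run of six or more hex characters standing alone
# as a "word" in the hostname. The real rule may differ.
HEX_RUN = re.compile(r"\b[0-9a-f]{6,}\b", re.IGNORECASE)


def hostname_flags(url):
    """Return illustrative flags for a URL, based only on its hostname."""
    host = urlsplit(url).hostname or ""
    flags = []
    if HEX_RUN.search(host):
        flags.append("long-hex-hostname")
    if host.endswith(".info"):
        flags.append("info-tld")
    return flags


if __name__ == "__main__":
    # Hypothetical hostnames, not the real sites mentioned above.
    for url in ("http://www.131500.example/",
                "http://ferries.example.info/timetable",
                "http://www.example.com/"):
        print(url, hostname_flags(url))
```

A hostname made of a six-digit phone number contains only the characters 0-9, so a pattern looking for hexadecimal runs cannot tell it apart from an obfuscated address.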

Thursday, August 27, 2009

Are a few big messages clogging email

I noticed that I had 51.6 Mbytes of space used up in hundreds of messages. But 35% of this was from just 28 messages which had large attachments. These were messages I did not really want to keep (and would have preferred not to get in the first place). The rest of the messages were very small. Ways to prevent such messages being received might provide large benefits for the capacity of systems.
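A measurement like this can be repeated on any mailbox. The sketch below assumes the mail is stored in mbox format and uses Python's standard mailbox module; the function names and the file path in the usage example are my own inventions.

```python
import mailbox


def message_sizes(mbox_path):
    """Return the size in bytes of each message in an mbox file."""
    return [len(msg.as_bytes()) for msg in mailbox.mbox(mbox_path)]


def share_of_largest(sizes, n):
    """Fraction of the total bytes taken up by the n largest messages."""
    total = sum(sizes)
    if total == 0:
        return 0.0
    largest = sorted(sizes, reverse=True)[:n]
    return sum(largest) / total


if __name__ == "__main__":
    # "inbox.mbox" is a hypothetical path; substitute a real mbox file.
    sizes = message_sizes("inbox.mbox")
    print(f"{len(sizes)} messages, "
          f"top 28 hold {share_of_largest(sizes, 28):.0%} of the space")
```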

As an example, if email attachments were kept on the sender's server and only sent when requested (and in most cases never requested), that would reduce email traffic. Perhaps a useful email utility would check to see if a document a person was attaching to email was already available publicly on the web and offer to insert a link instead.
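The link-instead-of-attachment idea could work roughly as follows. This is only a sketch: it assumes the utility already has an index mapping SHA-256 digests of publicly available documents to their web addresses, and how such an index would be built is left open. All names here are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex digest used as the lookup key for a document."""
    return hashlib.sha256(data).hexdigest()


def link_or_attach(data: bytes, public_index: dict):
    """Suggest a link if the exact document is in the (assumed) index.

    public_index maps SHA-256 hex digests to public URLs.
    Returns ("link", url) or ("attach", data).
    """
    url = public_index.get(sha256_of(data))
    if url is not None:
        return ("link", url)
    return ("attach", data)


if __name__ == "__main__":
    report = b"annual report contents"
    index = {sha256_of(report): "http://www.example.org/annual-report.pdf"}
    print(link_or_attach(report, index)[0])          # document is public
    print(link_or_attach(b"private notes", index)[0])  # not in the index
```

Matching on a digest only finds byte-for-byte identical copies, so a document that has been edited or re-saved would still be attached.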

Saturday, August 08, 2009

Google Wave and Advertising

An interesting topic is how advertising will fit into Google Wave. It would appear that Google Adsense type advertisements would be a good fit for Wave. A Wave robot would be able to carry out an analysis of what was being discussed in the wave and insert appropriate advertisements. This could be done more interactively and at a much finer level of granularity than with an email message, web page or a blog. If well done, the advertising should be helpful rather than obtrusive, rather like the robot being another participant in the conversation.

Thursday, November 13, 2008

Economics of Spam

The paper "Spamalytics: An Empirical Analysis of Spam Marketing Conversion" details how researchers hacked into a spam network to measure its effectiveness. I was interviewed about it ("Spammers making a profit") on ABC Radio for the PM program. The researchers suggest that Spam is not as profitable as previously thought. My main concern with the research was over the ethics and legality of the research technique.

Ever wondered how the companies that send out junk emails make any money, when most people delete the emails without reading them? Well, a group of computer scientists in California has found that spammers are turning a profit, despite only getting one response for every 12.5-million emails they send.

From: Spammers making a profit, PM, ABC Radio, Wednesday, 5:10pm on Radio National and 6:10pm on ABC Local Radio, 12 November, 2008 (audio also available)

The researchers hacked into the "Storm" botnet network and monitored how many messages were sent. They then set up two fake e-commerce web sites to see how many people would click through the spam ads to buy the products. They found only one in 12.5 million clicked through. Based on this they suggested Spam is not very profitable. It seems a reasonable conclusion and I suggested in the radio interview that the people doing this could probably earn more from the effort involved via legitimate e-commerce.
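To get a feel for what a rate of one response per 12.5 million messages means, the arithmetic can be sketched as below. Only the 12.5 million figure comes from the reports above. The revenue per response is a made-up parameter, and the sketch assumes the rate applies uniformly to every message sent.

```python
EMAILS_PER_RESPONSE = 12_500_000  # rate quoted in the radio report


def expected_responses(emails_sent, emails_per_response=EMAILS_PER_RESPONSE):
    """Expected number of responses for a given volume of spam."""
    return emails_sent / emails_per_response


def break_even_cost_per_email(revenue_per_response,
                              emails_per_response=EMAILS_PER_RESPONSE):
    """Highest cost per message at which sending still breaks even."""
    return revenue_per_response / emails_per_response


if __name__ == "__main__":
    print(expected_responses(100_000_000))
    # 100.0 is a hypothetical revenue per response, in any currency unit.
    print(break_even_cost_per_email(100.0))
```

At this rate the sender can only afford a tiny fraction of a cent per message, which is why spam depends on someone else's machines doing the sending.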

There are numerous research papers on the economics of Spam. The Wall Street Journal covered this in 2002: For Bulk E-Mailer, Pestering Millions Offers Path to Profit. That spam may not be as profitable as previously thought is interesting, but does not necessarily lessen its appeal to criminals.

However, my main concern was the methodology of the research. It is ethically and legally questionable for the researchers to hack into a spam network. Like any citizen, when a researcher finds someone doing something illegal, they have a responsibility to report that to the appropriate authorities so it can be investigated and those involved prosecuted. In this case the researchers do not appear to have done that and instead monitored the network and even set up their own e-commerce store to exploit it.

The researchers are from Dept. of Computer Science and Engineering, Berkeley and University of California, San Diego. Those institutions have ethical guidelines for research which the researchers should have consulted before proceeding.

In the ethics section of the paper, the authors state: "First, our instrumented proxy bots do not create any new harm" and "Second, our proxies are passive actors and do not themselves engage in any behaviour that is intrinsically objectionable; they do not send spam e-mail, they do not compromise hosts, nor do they even contact worker bots asynchronously." and "Finally, where we do modify C&C messages in transit, these actions themselves strictly reduce harm. Users who click on spam altered by these changes will be directed to one of our innocuous doppelganger Web sites."

However, the authors do not address the issue of whether they were taking part in a criminal activity, or whether they should have reported the criminal activities to the appropriate authorities. It seems a flawed argument for the researchers to say their activities were no more harmful than those being observed.
The “conversion rate” of spam — the probability that an unsolicited e-mail will ultimately elicit a “sale” — underlies the entire spam value proposition. However, our understanding of this critical behavior is quite limited, and the literature lacks any quantitative study concerning its true value. In this paper we present a methodology for measuring the conversion rate of spam. Using a parasitic infiltration of an existing botnet’s infrastructure, we analyze two spam campaigns: one designed to propagate a malware Trojan, the other marketing on-line pharmaceuticals. For nearly a half billion spam e-mails we identify the number that are successfully delivered, the number that pass through popular anti-spam filters, the number that elicit user visits to the advertised sites, and the number of “sales” and “infections” produced.

Categories and Subject Descriptors: K.4.1 [Public Policy Issues]: ABUSE AND CRIME INVOLVING COMPUTERS
General Terms: Measurement, Security, Economics

From: Spamalytics: An Empirical Analysis of Spam Marketing Conversion, Chris Kanich, Christian Kreibich, Kirill Levchenko, Brandon Enright, Geoffrey M. Voelker, Vern Paxson, Stefan Savage, CCS'08 Conference, ACM, October 2008

Thursday, July 17, 2008

Protecting email servers on the Internet

Michael Still will give a free seminar about his research on how to protect email servers on the Internet from denial of service attacks, on 2008-07-31 at the ANU in Canberra:

Measuring deployment of mail servers on the Internet
Michael Still (DCS, ANU)

DATE: 2008-07-31
TIME: 16:00:00 - 17:00:00
LOCATION: CSIT Seminar Room, N101

There are millions of email servers connected to the Internet. I have an interest in developing a survey of these servers to determine the current comparative popularity of the various SMTP implementations in existence. My specific interest is in developing Denial of Service (DoS) attack protections for such servers, where popularity data for SMTP implementations guides the testing regime for my proposed DoS defenses. This seminar will cover the survey methodology I am currently using, as well as early results.

Michael Still is a PhD student in DCS at the ANU, as well as being employed as an engineer at Google in Silicon Valley.
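The seminar notice does not describe the survey methodology, so the following is only my guess at one simple approach: read the greeting line an SMTP server sends on connection and look for the name of a known implementation in it. The signature table is a small illustrative assumption, and because administrators can change or hide the greeting, any tally made this way is approximate.

```python
import socket
from collections import Counter

# Assumed substrings that commonly appear in default greeting banners.
SIGNATURES = {
    "postfix": "Postfix",
    "exim": "Exim",
    "sendmail": "Sendmail",
    "microsoft esmtp": "Microsoft SMTP",
}


def fetch_banner(host, port=25, timeout=10.0):
    """Connect to an SMTP server and return its greeting text."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        return conn.recv(1024).decode("ascii", errors="replace").strip()


def classify_banner(banner):
    """Guess the SMTP implementation from a greeting banner."""
    lowered = banner.lower()
    for needle, name in SIGNATURES.items():
        if needle in lowered:
            return name
    return "unknown"


def tally(banners):
    """Count how often each implementation appears in a list of banners."""
    return Counter(classify_banner(b) for b in banners)


if __name__ == "__main__":
    # Sample banners for hypothetical hosts; no network access is needed.
    samples = ["220 mail.example.org ESMTP Postfix",
               "220 mx.example.net ESMTP Exim 4.69",
               "220 smtp.example.com ESMTP"]
    print(tally(samples))
```

Anyone running fetch_banner against servers they do not operate should check that doing so is permitted, which is the same ethical question raised about the Spam research above.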

Tuesday, April 01, 2008

Greenhouse gas from paper versus electronic mail

The US Postal Service is studying the environmental impact of mail delivery. But they claim that advertising mail reduces harmful emissions, by informing consumers and so reducing shopping trips.

By my own back of the (recycled) envelope calculations, an airmail letter from Canberra to Brisbane produces about 20 g of CO2 equivalent and this is about twenty times as much as email.

Here is the calculation:

CO2 from Paper Mail

A sheet of A4 paper weighs about 5 g.

An envelope and stamp will weigh about 7 g.

This gives a total of 12 g (0.012 kg) for a letter.

For a flight of around 2,500 km, about 0.126 kg of CO2 per km is produced to transport a passenger.

The standard weight for a passenger is 77 kg.

So that works out to about 1.64 g of CO2 per km per kg of cargo, or about 0.02 g per letter per km (1.64 g multiplied by 0.012 kg).

A letter which went 1,000 km (about the distance from Canberra to Brisbane) would produce about 20 g of CO2 equivalent.
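The steps above can be checked with a few lines of Python. The sketch uses only the figures quoted in this post; the constant and function names are my own. The per-letter figure comes from multiplying the per-kg rate by the mass of the letter in kilograms.

```python
PASSENGER_CO2_G_PER_KM = 126.0  # 0.126 kg of CO2 per passenger km
PASSENGER_MASS_KG = 77.0        # standard passenger weight
LETTER_MASS_KG = 0.012          # 5 g of paper plus 7 g envelope and stamp


def co2_g_per_kg_km():
    """Grams of CO2 per km for each kg carried on the flight."""
    return PASSENGER_CO2_G_PER_KM / PASSENGER_MASS_KG


def letter_co2_g(distance_km, letter_mass_kg=LETTER_MASS_KG):
    """Grams of CO2 attributed to flying one letter a given distance."""
    return co2_g_per_kg_km() * letter_mass_kg * distance_km


if __name__ == "__main__":
    print(f"{co2_g_per_kg_km():.2f} g per kg per km")
    print(f"{letter_co2_g(1):.4f} g per letter per km")
    print(f"{letter_co2_g(1000):.1f} g for a 1,000 km flight")
```

This treats the letter as if it were a share of a passenger's weight, which ignores sorting, road transport at each end and the paper itself, so it is a lower bound on the real figure.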

CO2 from Email

My estimate is that a 20 kbyte e-mail message (one A4 page equivalent) produces one gram of CO2 per year.

So email would be much better, as long as you did not keep the message online too long.

However, if the letter was only being transported a few tens of km within the same city by road, then the CO2 emissions for the paper letter would drop to under one gram. An email message kept online for a long time might then produce more CO2 than the letter.

Combined electronic and paper mail delivery

Obviously it is possible to reduce the impact of long distance paper mail by transporting it most of the way electronically and printing it near its destination. About twenty years ago I helped interface a system at the Department of Education to Australia Post's system to do this. Setting it up was complex, but it worked reasonably well. This should now be easy to do with standardized Internet based protocols.

Australia Post have a service called eLetter, which seems to be for printing and delivery of mail. Unfortunately Australia Post seems to have a very poor quality web site, making it difficult to find out about the service.

Also they seem to be concentrating on helping send more junk mail, with services such as Easy Post.

Large mail users, such as the federal government, could send correspondence to the nearest capital city electronically for local delivery. Apart from saving greenhouse gases, this would save money. Setting up a system for the whole of the Australian Government would be no harder than the system I helped build for one agency twenty years ago.

