Not so dumb after all

In a recent post, I described some strange image spam that has been arriving lately. The images appear to be nearly blank, consisting simply of a white background marked by a few random lines. I assumed that this was just another spammer screw-up, and wondered how long it would be before the spammer noticed and the tide of garbage images stopped. The answer is that we may be seeing these things for a while yet. They aren't an accident at all.

An anonymous reader of this blog wrote to point out that the images are actually animated GIFs. The first frame of the GIF consists of the meaningless image described; the second frame contains a standard stock pitch. Which you see will depend on which email client you use.
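For anyone who wants to check their own samples, here is a minimal sketch of how one might count the frames in a GIF without any imaging library. It walks the block structure of the file (header, optional colour tables, extensions, image descriptors, trailer) and counts the image descriptors. The function name count_gif_frames is my own invention, and the code assumes a well-formed file; a truncated or deliberately mangled GIF will simply raise an error.

```python
def count_gif_frames(data: bytes) -> int:
    """Count the image descriptors (frames) in a GIF byte string.

    Assumes a well-formed file; truncated input raises IndexError.
    """
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")

    # Logical screen descriptor: bytes 6-12. Byte 10 holds the packed flags.
    packed = data[10]
    pos = 13
    if packed & 0x80:  # global colour table present
        pos += 3 * (2 ** ((packed & 0x07) + 1))

    def skip_sub_blocks(p: int) -> int:
        # Data sub-blocks: a length byte, then that many bytes; 0 ends the run.
        while True:
            size = data[p]
            p += 1
            if size == 0:
                return p
            p += size

    frames = 0
    while pos < len(data):
        marker = data[pos]
        pos += 1
        if marker == 0x3B:  # trailer: end of the GIF data stream
            break
        if marker == 0x21:  # extension: one label byte, then sub-blocks
            pos = skip_sub_blocks(pos + 1)
        elif marker == 0x2C:  # image descriptor: one per frame
            frames += 1
            flags = data[pos + 8]  # after left, top, width, height
            pos += 9
            if flags & 0x80:  # local colour table present
                pos += 3 * (2 ** ((flags & 0x07) + 1))
            # Skip the LZW minimum code size byte, then the image data.
            pos = skip_sub_blocks(pos + 1)
        else:
            raise ValueError(f"unexpected block marker 0x{marker:02x}")
    return frames
```

Read the attachment as bytes and pass it in; a result greater than 1 means there is more to the image than the first frame shows.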

Spammers are not at all averse to targeting a subset of mail clients. In the early days, there was a lot of spam that only made sense to recipients who used AOL or the AOL mailer. Later on, my mantra when puzzling out apparently illegible spam became "But how does it look in Outlook?". The answer was often that a spam that looked ugly or made no sense to me when viewed in my mail client looked just the way the spammer wanted it when viewed in Microsoft Outlook, the mailer used by the majority of the spammer's core market. Now the question should probably be "But how does it look in webmail?".

And the answer, in the case of these spams, is almost certainly "Just fine". Mail clients may not all display animated GIFs, but a web browser certainly will. And I wouldn't be surprised to learn that the double-text spams, which failed to display properly in regular mail clients, would render as the spammer intended in certain webmail systems. Perhaps not all of them, however: I notice that they have now abandoned that particular tactic.

The goal, as ever, is to get the spam past the spam filters and under the eyes of the user. The animated GIF spams, like the multi-image 'puzzle' spams that may well have originated from the same spammer, are designed to defeat spam filters that use OCR to turn the embedded image into plaintext (the random lines are included to give each image a unique 'signature' to prevent simple checksum or pattern matching).
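To see why a handful of random marks is enough to beat checksum matching, here is a small sketch. It is a stand-in only: it works on a raw buffer of "pixel" bytes rather than a real compressed image file, and the helper names (add_random_marks, checksum, differing_bytes) are made up for the illustration. The point is that two images that differ in a tiny fraction of their bytes still produce completely different digests.

```python
import hashlib
import random

WHITE, BLACK = 0xFF, 0x00


def add_random_marks(pixels: bytes, count: int, seed: int) -> bytes:
    """Return a copy of `pixels` with up to `count` randomly chosen bytes set to black."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    marked = bytearray(pixels)
    for _ in range(count):
        marked[rng.randrange(len(marked))] = BLACK
    return bytes(marked)


def checksum(data: bytes) -> str:
    """SHA-256 digest of the data, as a filter might store for known spam images."""
    return hashlib.sha256(data).hexdigest()


def differing_bytes(a: bytes, b: bytes) -> int:
    """Number of positions at which two equal-length buffers differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    blank = bytes([WHITE]) * 10_000  # a 100x100 all-white "image"
    first = add_random_marks(blank, 20, seed=1)
    second = add_random_marks(blank, 20, seed=2)
    print(differing_bytes(first, second), "bytes differ")
    print(checksum(first))
    print(checksum(second))
```

A filter that blocks on the digest of the first image learns nothing useful about the second, even though a human would call them the same picture.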

It's questionable how useful this is, though. It won't take long for the developers of OCR-based filters to update their software to handle multi-frame GIFs. More to the point, it doesn't actually seem to improve the deliverability of the spam. My own filters are not that good (ironically, despite the time I spend studying and commenting on spam, my home-made filtering setup probably lets through far more spam than a professionally-maintained system like Hotmail's or Yahoo!'s), but these latest spams have all landed squarely in the bit bucket every time. For most purposes, the question of what is being advertised is entirely academic: the filters just need to decide "Is it spam or not?" There's no real need to OCR the embedded image to find out what the spam's pushing if there are other features of the message - such as failing DNSBL tests - that allow you to recognize it as spam.
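For readers who haven't met DNSBL tests, here is a rough sketch of how one works; it is not my actual filter. The standard convention is to reverse the octets of the connecting IPv4 address, append the blocklist's zone, and look the result up in DNS: if the name resolves, the address is listed. The zone dnsbl.example.org and the function names are placeholders of mine, not a real list or a real API.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a bad address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the lookup succeeds, i.e. the address is on the list.

    `resolve` is injectable so the logic can be tried without network access.
    Note that any lookup failure, including a temporary one, counts as
    "not listed" in this sketch.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


if __name__ == "__main__":
    # 192.0.2.99 is a documentation address; the zone is a placeholder.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A check like this runs before the message body is even looked at, which is why the contents of the image hardly matter once the sending address is on a list.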

It would be interesting to know how many of these spams do get through to the webmail clients used by the spammer's core market. My own filters may be imperfect, but because they serve only a limited number of users I can be a lot more aggressive about embedded images and Asian senders than Hotmail or Gmail can. Tests that work well on the kind of mail that I and my customers receive might generate far too many false positives when applied to the mailstream of a big webmail provider. My guess is that the spammer has taken this into consideration. Just as the spams are crafted to display reliably in the clients used by the bulk of the target market - webmail clients - the spammer has probably carried out the necessary experiments to make sure that these recipes will get the spam through enough of the time to make it worthwhile.

These new spams are probably not going to go away for a while, although the spammer will eventually switch tactics (I've noticed that the multi-image 'puzzle' spams are getting rarer, and the sender has reverted to single-image spams again) and try something new. My guess would be that the spammer's next gift to the OCR filters will be a first frame full of hashbuster text, although they may already have considered that and rejected it. What's clear, however, is that the stakes are high enough for the spammers to devise some really quite ingenious approaches to getting round filters and that what at first appears to be cluelessness may in fact be anything but.

My thanks to the anonymous tipster who set me straight on this point.
