Spammers are using ISO encoding, in Subject and From, to evade spam filters
Most people who see an obvious spam email message, based on the "Subject" or "From" fields, just delete it on sight. I often go one step further and examine the normally hidden source code. This gives me an insight into some of the tricks criminals employ to get their spam messages past the spam filters used by many ISPs and email providers. It also helps me to develop new spam filters, or modify existing ones, that I publish for MailWasher Pro users.
I have seen many changes in spam composition tactics over the years I have spent fighting spammers. One trick that used to be prevalent a few years ago is making a big comeback right now. That is the use of "ISO Encoding" for the Subject, From and sometimes other fields in the normally hidden email headers. This type of encoding has legitimate uses and senders (like Yahoo), so don't rush to premature conclusions and block everything containing an ISO subject.
What is ISO encoding and why do spammers employ it as an evasion tactic?
ISO is the International Organization for Standardization, which establishes common standards for all manner of interoperable systems used around the world, so that they can interact with one another. Among them are the character set standards, such as the ISO-8859 family, that email messages rely on. The syntax for declaring a character set inside an email header (the "encoded-word" format shown later in this article) comes from the Internet MIME standards (RFC 2047) rather than from ISO itself. This "Codepage" declaration tells an email client (program/reader) which character set the text was written in and how to render the contents when the message is opened.
The Codepage most commonly used in English language email messages is known as ISO-8859-1. It corresponds to the "Latin-1" character set and closely matches "Windows-1252" (not "Windows-1251", which is a Cyrillic Codepage). If an email is composed without any declaration of Codepage, and is sent through mail servers assigned to Western languages, it is typically displayed using the default Western character set of the user's computer.
Since email composed in one language locality is frequently sent to recipients who use a different language and alphabet, senders can declare the character set they used, so that the recipient's email program renders the intended letters. This is where the use of ISO encoding in the email headers comes into play. It is used frequently by international companies in email blasts to numerous recipients around the globe.
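Why the declared Codepage matters can be seen in a few lines of Python. This is only an illustrative sketch: the helper name render_byte is made up for this example, and the codec names are the ones Python's standard library uses for these character sets. It shows that a single byte value renders as a different letter depending on which Codepage the reader assumes.

```python
def render_byte(value, codec):
    """Decode a single byte value using the named character set."""
    return bytes([value]).decode(codec)


# The same byte, 0xE9, interpreted under three different Codepages.
for codec in ("iso-8859-1", "cp1252", "cp1251"):
    print(codec, "->", render_byte(0xE9, codec))
```

Under ISO-8859-1 and Windows-1252 the byte is an accented Latin "e"; under Windows-1251 it is a Cyrillic letter.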
Spam email also benefits from ISO encoding. Here's how:
Many free email systems, like Microsoft's Hotmail, are plagued with "bots" used by spammers to break security challenges (e.g. CAPTCHA), open new free mail accounts using bogus information, then send out spam blasts to the recipients listed in spam databases. The spammers may get only one or two successful spam runs before they trigger alarms at the email provider and the account gets shut down. But, to ensure that the spam actually gets out at all, they have to make sure it isn't blocked by the outgoing email server's spam detection filters. In English speaking countries, the default spam filters are written in English and match English language words and phrases.
Spammers using these free email providers have learned that one of the easiest ways to avoid having spam messages blocked by outgoing filters is to keep plainly readable English words and phrases out of the From, Reply, or Subject fields. Instead, they resort to ISO encoding tricks. The outgoing spam filters look at the hidden headers, as well as a snippet of body text, for significant matches. Many incoming mail servers use the same spam detection systems. By using ISO encoding in the From and Subject, one can sneak spam words past many common spam filters.
Once these messages arrive in recipients' inboxes, their email program ("client"), or web-mail browser, translates the ISO codes into the characters of the character set named in the Codepage declaration. In the case of ISO-8859-1, that is the ordinary Western alphabet, so the displayed words appear as plain English. The recipient does not see any of the coding tricks, just the decoded letters and words. The message slipped past anti-spam filters at the sending end and at the receiving email server, both of which look at the headers first and then at only so many lines of the body text.
Most of the ISO spam messages also pad the beginning of the body with ISO or other encoding tricks, gibberish (word salad) and non-displaying text hidden inside html style tags. This pushes the actual spam words and links way down, past the point where most commercial spam filters give up.
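The evasion described above can be sketched in Python. This is a rough illustration under stated assumptions, not how any particular ISP's filter works: the keyword list and the function names (q_encode_all, contains_blocked_word, decode_header_value) are invented for the example. Only email.header.decode_header comes from Python's standard library. The sketch shows a naive keyword match missing the encoded header, then catching it once the header is decoded the way an email client would.

```python
from email.header import decode_header

BLOCKED_WORDS = ("cialis", "levitra")  # hypothetical keyword list


def q_encode_all(text):
    """Hex-encode every character into one ISO-8859-1 encoded-word."""
    hexed = "".join(f"={byte:02X}" for byte in text.encode("iso-8859-1"))
    return f"=?ISO-8859-1?Q?{hexed}?="


def contains_blocked_word(header_value):
    """Naive substring match, standing in for a keyword-based filter."""
    lowered = header_value.lower()
    return any(word in lowered for word in BLOCKED_WORDS)


def decode_header_value(raw):
    """Decode encoded-words back into the text a recipient would see."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)


raw_subject = q_encode_all("CIALIS-80%-OFF")
print(raw_subject)                                              # what the filter sees
print(contains_blocked_word(raw_subject))                       # match on the raw header
print(contains_blocked_word(decode_header_value(raw_subject)))  # match after decoding
```

The point of the sketch is the order of operations: a filter that matches words before decoding the headers never sees the words at all.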
If you want to learn more about the use of ISO encoding, as it pertains to spam filters and email, read my extended content.
What does a hidden ISO-8859-1 encoded From or Subject look like?
From: "=?ISO-8859-1?Q?=4D=65=64=73=34=4C=65=73=73?="
Subject: =?ISO-8859-1?Q?=47=45=54=2D=56=31=41=47=52=41=2D=43=49=41=4C=49=53=2D=4C=45=56=49=54=52=41=2D=38=4F=25=2D=30=46=46?=
Translated by your email client, these codes become recognizable words about Meds or online pharmacies, and include the names of popular anti-ED prescription drugs, with registered trademarks being violated by the purveyors of illicit fake pharmacies selling counterfeit pharmaceuticals. See my Spam Issues category of this blog for more articles about various fake pharmacies and the criminals running them.
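If you are curious how the translation works, each "=XX" in those headers is simply the hexadecimal code of one character. The following Python sketch decodes the two headers shown above by hand. It is a minimal illustration, not a full decoder: it handles only the "Q" style shown here (not the Base64 "B" style), and the names ENCODED_WORD, decode_q_word and decode_q_header are made up for this example.

```python
import re

# Matches =?charset?Q?payload?= (the "Q" style of encoded-word only).
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[Qq]\?([^?]*)\?=")
HEX_ESCAPE = re.compile(rb"=([0-9A-Fa-f]{2})")


def decode_q_word(match):
    """Turn one encoded-word into readable text."""
    charset, payload = match.group(1), match.group(2)
    payload = payload.replace("_", " ")  # in Q encoding "_" stands for a space
    raw = HEX_ESCAPE.sub(
        lambda m: bytes([int(m.group(1), 16)]),  # "=4D" becomes byte 0x4D
        payload.encode("ascii"),
    )
    return raw.decode(charset)


def decode_q_header(value):
    """Replace every encoded-word in a header value with its decoded text."""
    return ENCODED_WORD.sub(decode_q_word, value)


from_raw = '"=?ISO-8859-1?Q?=4D=65=64=73=34=4C=65=73=73?="'
subject_raw = (
    "=?ISO-8859-1?Q?=47=45=54=2D=56=31=41=47=52=41=2D=43=49=41=4C=49=53"
    "=2D=4C=45=56=49=54=52=41=2D=38=4F=25=2D=30=46=46?="
)

print(decode_q_header(from_raw))
print(decode_q_header(subject_raw))
```

Note that the decoded Subject swaps a digit "1" for the letter "I" and mixes the letter "O" with the digit "0", which is a second layer of filter evasion on top of the encoding.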
To see this, one must know how to display the headers or source code of the incoming message. If you use a real email "client", like Outlook Express (deprecated) or Windows Live Mail, you can easily display the source code of any message as it sits unopened in your inbox. Just right-click on the message, then move your pointer all the way down to "Properties" and left click to open a box with the properties. Click on the "Details" tab to see the hidden headers, which are shown in a text field that is much too small. Click on the button labeled "Message Source" and a large, expandable window will open, showing not just the headers, but the entire source code of everything in the email message.
If you use your web browser to read email, read the article I wrote in 2006 about "How to display the headers of spam/scam emails..." - or check your web-mail "options" or "preferences" links to see how you can display the "full" or "complete" "incoming headers."
Spam fighters who belong to SpamCop use this feature to display, then copy the entire spam message source code, then paste it into a SpamCop report and submit it. I do this with every spam message that makes it past my auto-delete spam filters. Since I also use MailWasher Pro, which screens incoming messages before I download them to my email client, I am able to submit spam directly from the program interface, to SpamCop. I have to acknowledge an email reply and click on its link to actually file the report, as per SpamCop's requirements. However, it saves the time that would be wasted opening the source code, copying it and then logging into SpamCop and pasting it into the report field.
Since MailWasher Pro can read the hidden headers, it is trivial to write a spam filter that detects the use of ISO-8859-1 encoding (as shown above) in the headers and labels any message having that encoding as possible spam. When you look at the Subject and/or From columns, in the MailWasher interface, you will see the translated characters and words, as intended by the sender. I simply whitelist any known legitimate senders who use this encoding and automatically delete everything else. It works for me!
Spammers don't just use ISO encoding to display English words. Lately, for reasons as yet unknown, I have been receiving Spanish and French language spam. These messages often make use of a different ISO Codepage to render the accented characters used in these foreign (to me) languages. If some spammer thinks that the recipient reads an Oriental language he will use one of the Oriental Codepages. You might as well send me Canaanite-Phoenician hieroglyphics as any Oriental character sets!
Again, I was able to write a MailWasher filter that detects accented characters and words common to French and Spanish, but not English. You can read about my MailWasher Spam Filters here. They are free to download, but donations are very much appreciated!
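For readers who do not use MailWasher, the same idea can be sketched in Python. To be clear, this is not MailWasher filter syntax and not my published filter. It is a rough approximation under stated assumptions: the whitelist address, the labels "keep" and "possible spam", and the function name classify are all invented for this example. It flags any message whose From or Subject contains an ISO-8859-1 encoded-word, unless the sender's address is whitelisted.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Matches the start of an ISO-8859-1 encoded-word, Q or B style.
ISO_WORD = re.compile(r"=\?iso-8859-1\?[qb]\?", re.IGNORECASE)

WHITELIST = {"newsletter@example.com"}  # hypothetical trusted sender


def classify(raw_message):
    """Return 'possible spam' for non-whitelisted ISO-encoded headers."""
    msg = message_from_string(raw_message)
    sender = parseaddr(str(msg.get("From", "")))[1].lower()
    if sender in WHITELIST:
        return "keep"
    for name in ("From", "Subject"):
        if ISO_WORD.search(str(msg.get(name, ""))):
            return "possible spam"
    return "keep"


spam = (
    'From: "=?ISO-8859-1?Q?=4D=65=64=73?=" <seller@spam.example>\n'
    "Subject: hello\n\nbody\n"
)
legit = (
    "From: =?ISO-8859-1?Q?News?= <newsletter@example.com>\n"
    "Subject: =?ISO-8859-1?Q?Caf=E9?=\n\nbody\n"
)

print(classify(spam))
print(classify(legit))
```

Bear in mind that a From address is easily forged, so a whitelist based on the address alone is only as trustworthy as the rest of your screening.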
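The accented-character check can also be approximated in Python. Again, this is an illustrative sketch and not the MailWasher filter itself: the character list is my own rough guess at letters common in French and Spanish, and the function names are invented for the example. It decodes the header first, then looks for any of those characters.

```python
from email.header import decode_header

# Assumed set of characters common in French and Spanish but rare in English.
ACCENTED = set("áéíóúñü¿¡àâçèêëîïôùû")


def decode_header_text(raw):
    """Decode any encoded-words in a header value into plain text."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)


def has_foreign_accents(raw_header):
    """True if the decoded header contains any character from ACCENTED."""
    text = decode_header_text(raw_header).lower()
    return any(ch in ACCENTED for ch in text)


spanish = "=?ISO-8859-1?Q?Oferta_especial_para_el_se=F1or?="
english = "Special offer for you"

print(decode_header_text(spanish), has_foreign_accents(spanish))
print(english, has_foreign_accents(english))
```

A check like this will also catch the occasional legitimate English message containing a borrowed word with an accent, so it is best paired with a whitelist.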
The content on this blog may be reprinted provided you do not modify the content and that you give credit to Wizcrafts and provide a link back to the blog home page, or individual blog articles you wish to reprint. Commercial use, or derivative work requires written permission from the author.