Use RegEx to filter spam from your mail server - part 1
July 4, 2017
For years now, I have been writing and publishing spam filters for MailWasher Pro, which is a desktop POP3 and IMAP email filtering program. My filters are very effective at flagging or deleting spam, scams and malware links or attachments. That's great if you use MailWasher Pro. But, if you don't use a spam filtering program and are using your own hosted domain for email, which you read in a desktop email client (not browser based Webmail or Gmail), my regular expressions email filters may protect you from spam threats.
First, the term "regular expressions" is usually abbreviated as "RegEx," which is how I will refer to them henceforth in this article. They are characters and formatting that can match all manner of words, numbers, HTML codes, and even empty typed spaces and line feeds. While I usually write RegEx codes by hand, I always test them in a program called Regex Match Tracer. If you want to play around in RegEx land, get a copy of Match Tracer to find and fix errors in your expressions before you upload them.
Even though I am a long time MailWasher Pro registered user and supporter, there are just some types and sources of email that I don't even care to see in MailWasher's Recycle Bin (the built-in Recycle Bin lets you restore email that you deleted by accident or that was misidentified as spam). Some are repeat spam senders, or Chinese or Russian senders who mistakenly think I give a crap about their counterfeit pills or dating scams. Still others are sent from botnets I have already identified and blocked by certain lines in their email headers.
Get it? Got it? Good! Let's move on to some examples of my own RegEx filters that I use on my mail server for my Bluehost web hosting account.
When you want to block any email from staying available for downloading from your hosted mail server, you have to either send it to the bit bucket (aka: Discard, Send to Dev Null), or Fail (with or without a message to the sending domain). In lieu of deleting a message, you have the option of redirecting it to another email account, which can even be a Gmail account. So, when I create a new mail filter under my hosting cpanel, I have to decide what to do with those messages matching my criteria.
Sometimes, the safest thing to do at first is to redirect those filtered messages to an account you created only to receive "junkmail." Then you will know if the filter is working when certain email messages begin appearing in that account, shortly after you created the filter to catch them. After a few days of testing to ensure that only the email you targeted is getting redirected, change the action to delete, or fail.
Let's look at the various fields that are available for "email filtering" on a cpanel of an Apache server hosted web account. I'll use a list to keep the steps in order. I'm using the cpanel options available for shared hosting accounts on Bluehost. Yours may vary if you use a different web hosting company, or run a dedicated server.
- Log into your domain's "hosting login" account; not the webmail login.
- If you don't automatically arrive at cpanel, click on the link or tab labeled "cpanel."
- Mouse down to the "email" section, then click on the icon labeled "Account Filtering."
- When the Account Filtering page opens, click on "email filters." This displays a list of all email filters you have created, if any, under the heading: "Email Filters."
- At the bottom of the list of filters (if any) find the green button labeled "new filter" and press it.
- The following items will appear, starting with the text: "Create email message filters that will apply to all email for your account." Under that you have an "Account" options field which lists all of the domain POP3 email accounts you have created thus far. You must have at least one domain email address to use these filter options.
- The top option in the email account list is "All email addresses" - which I always select. Otherwise, you have to pick and choose which email address each filter applies to. When I'm blocking spammers, scammers and hostile messages, I don't dink around with only protecting one mailbox. It will also let the existing filters apply to any new accounts you create down the line.
- Next, click on "Filter Name" and type in a meaningful name for your filter. The rule name must be unique, otherwise you will overwrite the previous filter with that same name. For testing purposes, name your first filter "Weight Loss Scams."
- Click on the button labeled "Next." This opens up two previously hidden input sections, labeled: "Rules" and "Actions." We'll start by explaining the available rules.
- Rules have two flyout (multi-choice) input fields. The first defines the portion of the email message that the rule applies to. The second field determines how the text is matched to trigger your filter. The default for the first field is "From" and the default for the second field is "equals." I have 13 options for the first field (From, etc.) and 10 for the second field (equals, etc.). I usually think of these two fields as Section (the part of the email that is being tested) and Criteria (how it is matched or not matched).
- For the purpose of creating a test filter, leave "From" selected and type in:
(?i)(weight\s?loss|Dr\.\s?Oz)
under the From field. Change the second Criteria field to "matches regex" instead of "equals."
- Click on the + button on the right side of the first rule to open another set of input selectors. This time, choose "Subject." Make sure that the radio option "OR" is selected for the second rule.
- In the Subject input box, type, or copy and paste:
(?i)diet|nutrition|weight\s?loss|(burn|lose)\s(fat|pounds|weight)
Again, change the second Criteria field to "matches regex" instead of "equals."
- Under the "Actions" field, select "Discard message" then click on the "create rule" button.
If all goes well, a few seconds after you pressed create rule, the new rule should appear in the list of email filters, with the label you typed in at the beginning. If you have made all of the option choices I listed, any new email that matches either the From field or Subject terms will be discarded (permanently deleted off the mail server with no record or warning). The terms that are matched by my RegEx are as follows.
- From contains (case insensitive):
- Weight Loss
- WeightLoss
- Dr.Oz
- Dr. Oz
- The Subject contains (case insensitive):
- diet
- dietitian
- nutrition
- nutritionist
- weight loss
- weightloss
- burn fat
- burn pounds
- burn weight
- lose fat
- lose pounds
- lose weight
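If you want to check terms like these on your own computer before trusting them on the server, here is a minimal sketch in Python. It assumes Python's `re` module treats these two patterns the same way the mail server's regex engine does, which holds for the simple syntax used here but is not guaranteed for every expression. The function name `would_discard` and the sample senders are made up for illustration; they are not part of cpanel.

```python
import re

# The two patterns from the filter, copied verbatim.
FROM_PATTERN = re.compile(r"(?i)(weight\s?loss|Dr\.\s?Oz)")
SUBJECT_PATTERN = re.compile(
    r"(?i)diet|nutrition|weight\s?loss|(burn|lose)\s(fat|pounds|weight)"
)


def would_discard(from_header: str, subject: str) -> bool:
    """Hypothetical helper that mimics the filter's OR logic:
    a match on either the From rule or the Subject rule is enough."""
    return bool(
        FROM_PATTERN.search(from_header) or SUBJECT_PATTERN.search(subject)
    )


if __name__ == "__main__":
    # Invented sample messages: (From header, Subject header).
    samples = [
        ("Dr. Oz <offers@example.com>", "Hello"),
        ("WeightLoss Tips <x@example.com>", "Newsletter"),
        ("Friend <pal@example.com>", "Ask a nutritionist how to burn fat"),
        ("Friend <pal@example.com>", "Lunch on Friday?"),
    ]
    for sender, subject in samples:
        verdict = "DISCARD" if would_discard(sender, subject) else "keep"
        print(f"{verdict:8} {sender} | {subject}")
```

The sketch uses `search` rather than a full-string match because the terms are found anywhere inside the field. That is also why "dietitian" and "nutritionist" are caught by the shorter terms "diet" and "nutrition."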
Feel free to edit this filter by clicking on the "Edit" button to the right of the filter name. You can either add new terms to existing rules, or click on the last + to add another rule field, type in your additions, then click the "update rule" button. The new conditions will be added and saved.
All of these forged senders and sucker bait subjects are blocked by two fairly short regular expression filters. The expression (?i) tells the RegEx interpreter to ignore letter case, meaning either capital or lowercase letters will match. The \s means a single whitespace character (such as a space or tab) must exist. A \s? means a space may or may not exist. This can be altered to include dashes or other single characters between words by simply substituting .? for \s or \s?. The .? means zero or one of any character or space. By contrast, .* means zero or more of any character, which can match a lot more than you intended. A question mark after an item in a RegEx means that item may or may not exist. The | symbol is used to separate individual terms to match using OR, as in: A|B|C - meaning A or B or C.
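To see how the choice of separator between two words changes what gets caught, here is a small sketch in Python. It assumes Python's `re` module handles \s, ? and . the same way the mail server does, which is true for this basic syntax. The helper name `matching` and the sample phrases are invented for the demonstration.

```python
import re

# Invented sample phrases to test each separator against.
PHRASES = [
    "weight loss",
    "weightloss",
    "weight-loss",
    "weight gain and hair loss",
]


def matching(pattern: str, phrases: list[str]) -> list[str]:
    """Hypothetical helper: return the phrases the pattern is found in."""
    return [p for p in phrases if re.search(pattern, p)]


if __name__ == "__main__":
    for pattern in (
        r"(?i)weight\sloss",   # exactly one whitespace character
        r"(?i)weight\s?loss",  # zero or one whitespace character
        r"(?i)weight.?loss",   # zero or one of any character
        r"(?i)weight.*loss",   # zero or more of any character
    ):
        print(pattern, "->", matching(pattern, PHRASES))
```

Each pattern in that list catches everything the one before it does, plus more. The last one, with .* between the words, even matches "weight gain and hair loss," so it is worth testing a wide separator like that before you attach a Discard action to it.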
If you want to delve deeper into regular expressions, you can learn about RegEx here
My next article on RegEx will show you how to block IP addresses and ranges and particular spam domains.
If you like this article please share it.
The content on this blog may be reprinted provided you do not modify the content and that you give credit to Wizcrafts and provide a link back to the blog home page, or individual blog articles you wish to reprint. Commercial use, or derivative work requires written permission from the author.