July 9, 2017

Use RegEx to filter spam from your mail server - part 2

On the 4th of July I wrote an article explaining how you can use Regular Expressions (RegEx) to create spam filters that can be applied to a mail server for your commercially hosted domains. This article shows how to create RegEx filters to block spam based on the IP addresses of the mail servers found in the headers of incoming emails.

If you haven't read the first article in this series, I recommend you do so now. It has lots of important information that this article builds upon. It will open in a new tab so you can refer to it as necessary.

Email messages contain a section that is normally hidden from view when you read the body text. This section is called the email headers, and it contains the actual routing details for each incoming and outgoing message. Some of those details can be forged by spammers and frequently are. But others are not easily forged, including certain numeric entries that relate to the IP addresses of the email servers through which the message has passed.

So, without any further ado, let's look at a spam filter to block unwanted IP addresses.

The following header details came from a spam email foisting counterfeit sunglasses from China. I don't care to receive any email from China at this point in time and these spam messages for counterfeit goods never contain an unsubscribe link. I have replaced personally identifiable details with the word REDACTED.

Return-path: <[email protected]>
Envelope-to: REDACTED
Delivery-date: Sun, 09 Jul 2017 03:59:20 -0600
Received: from [47.94.42.47] (port=2712 helo=qq5.wolegequ.co)
by REDACTED.com with esmtp (Exim 4.87)
(envelope-from <[email protected]>)
id 1dU8zo-002MYl-1T
for REDACTED; Sun, 09 Jul 2017 03:59:20 -0600
Message-ID: <[email protected]>
From: "bxjoag" <[email protected]>
To: REDACTED
Subject: Ray Ban Sunglasses sale with 80% discount REDACTED
Date: Sun, 9 Jul 2017 17:59:08 +0800
MIME-Version: 1.0
Content-Type: text/html;
charset="utf-8"
Content-Transfer-Encoding: base64
X-mailer: Pkx 4

In the above headers, the line beginning with "Received: from [47.94.42.47]" contains the IP address of the mail server that delivered the email to my hosted domain's email system. If I copy the numbers 47.94.42.47, paste them into the IP input box at tcpiputils.com and submit it, the results show that the IP is registered to a company in Hangzhou, Zhejiang, China. Additionally, they reveal that this IP is part of a very large range of 262,144 IPs (lookup tools often show 262,142, because they leave out the first and last addresses), known as a CIDR, ranging from 47.92.0.0 through 47.95.255.255, which is designated in CIDR shorthand notation as 47.92.0.0/14. The entire CIDR is assigned to China.
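For readers who want to check the size of that range themselves, here is a minimal sketch using Python's standard ipaddress module. Python is my choice for illustration only; it is not part of the cPanel filter. The sketch counts every address in the /14, including the first and last ones that some lookup tools leave out of their totals.

```python
import ipaddress

# The /14 block reported for the spam server in the header above.
network = ipaddress.ip_network("47.92.0.0/14")
sender = ipaddress.ip_address("47.94.42.47")

first_ip = network.network_address    # lowest address in the block
last_ip = network.broadcast_address   # highest address in the block
total = network.num_addresses         # 2 ** (32 - 14)

print(first_ip, "-", last_ip)  # 47.92.0.0 - 47.95.255.255
print(total)                   # 262144
print(sender in network)       # True
```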

Say you want to create a spam filter that will block just that one IP address. Here is how you would do that. First, you would create a new email filter by logging into your (Apache server based) hosting account, then follow the click route to "cpanel > Email > Account Filtering > Email Filters > New Filter."

Select "All email addresses" then type in a name for the filter, like "Block Chinese IPs."
First Rule = Any Header > Matches Regex:
Received:\ from\ \[47\.94\.42\.47\]
Actions: Discard
Click on Create Filter and it should be saved to your Filters list.
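Before saving a filter like this, it can help to try the expression against the header line on your own computer. The following is only a sketch, assuming Python 3 and its built-in re module, which is not the same RegEx engine your mail server uses, so treat a passing test as a sanity check rather than a guarantee. The second test line is a made-up neighboring address.

```python
import re

# Received line from the sample headers above.
received = "Received: from [47.94.42.47] (port=2712 helo=qq5.wolegequ.co)"

# Hypothetical line for a neighboring IP that the filter should leave alone.
other = "Received: from [47.94.42.48]"

# The single-IP filter exactly as typed into the Matches Regex box.
single_ip = re.compile(r"Received:\ from\ \[47\.94\.42\.47\]")

print(bool(single_ip.search(received)))  # True
print(bool(single_ip.search(other)))     # False
```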

If you want to block all 262,144 IPs within 47.92.0.0 through 47.95.255.255, you'll need to use slightly advanced RegEx, thusly:
Received:\ from\ \[47\.9[2-5](\.\d{1,3}){2}\]

Here is an explanation of the two RegEx filters.
"Received:\ from\ " is the beginning of the pertinent line in the header. The \ followed by a press of the space key is how one designates a literal blank space. Many RegEx interpreters accept a plain typed space, but some ignore unescaped spaces (for example, when running in extended, or "free-spacing," mode), and others may throw an error, ignoring all or part of your filter. Play it safe and escape your blank spaces with a \ .

Next, \[ and \] are how you escape the bracket characters when you need to use them literally. Since opening and closing brackets surround the IP address in the header, you must put a backslash before each of them. You need to escape the brackets because they also have a particular meaning to the RegEx interpreter: brackets around numbers, letters, or other characters define a set (or range) of characters, any one of which will match at that position.

The numbers making up the single IP address are actual numbers found in the header, so they can be pasted in as found. But the dots between groups of numbers have a meaning to the RegEx engine (a period matches exactly one of any character). To avoid mistaken matches, we must escape those dots (periods) with a leading backslash, like this: \.
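To see why the escaping matters, here is a small sketch, again assuming Python's re module as a stand-in for the mail server's RegEx engine. The look-alike string is invented for the test; it has digits sitting where the dots should be.

```python
import re

unescaped = re.compile(r"\[47.94.42.47\]")   # dots left as wildcards
escaped = re.compile(r"\[47\.94\.42\.47\]")  # dots matched literally

# Hypothetical text with other characters where the dots belong.
lookalike = "[47194242947]"

print(bool(unescaped.search(lookalike)))  # True  (a mistaken match)
print(bool(escaped.search(lookalike)))    # False
```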

Next, in the second filter that encompasses the entire /14 CIDR, the numbers making up a range are within brackets. Here again is that expression: 47\.9[2-5](\.\d{1,3}){2} That range in our example is from 92 through 95, and is coded using: 9[2-5]. This covers the number 9, in combination with any number from 2 through 5. We could also write the actual numbers inside the brackets as [2345], but I find the shorthand with a dash between the lowest and highest numbers much easier to write.

Finally, the expression (\.\d{1,3}){2} translates to "a dot, followed by one, two or three numeric digits, twice" - because \d means one numeric digit. The curly brackets encompass a multiplier or multiplier range for whatever immediately precedes the left curly bracket. Inside the parentheses, {1,3} applies to \d. Outside them, {2} applies to the whole parenthesized group, which is why the parentheses are required. The reason I used the shorthand method of any digits one through three times is that when dealing with IPv4 addresses, each group of numbers ranges from 0 through 255. It is much simpler to write (\.\d{1,3}){2} than a long form that only accepts 0 through 255 in each group: (\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){2}. One has 14 characters while the other has 40. The short form would also match impossible numbers like 999, but since those never appear in a real mail server's IP address, both accomplish the same results here.
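The edges of the range are where a filter like this is most likely to go wrong, so they are worth testing. This sketch assumes Python's re module rather than the mail server's own engine, and the helper name blocked is made up for the example. It wraps each test IP in a "Received: from [...]" line and checks it against the range filter shown earlier.

```python
import re

# The /14 range filter exactly as given above.
cidr_filter = re.compile(r"Received:\ from\ \[47\.9[2-5](\.\d{1,3}){2}\]")

def blocked(ip: str) -> bool:
    """Return True when a Received line for this IP matches the range filter."""
    return cidr_filter.search(f"Received: from [{ip}]") is not None

# Addresses just outside, on, and inside the edges of 47.92.0.0/14.
for ip in ["47.91.255.255", "47.92.0.0", "47.94.42.47", "47.95.255.255", "47.96.0.0"]:
    print(ip, blocked(ip))
```

Only the three middle addresses should report True. The ones in 47.91 and 47.96 fall outside the range and should report False.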

You can add more "Received: from" IP lines as add-on rules by using the + symbol on the right of the last rule and choosing the correct new conditions and expressions. Always save your existing filters before making additions, in case you make a mistake, or the changes don't take (it happens on some cpanels).

What we've learned today

Regular expressions can be used to match numbers making up IP addresses of spamming mail servers.

Coming up in the next installment, I will show you how to combine numeric rules into one long line of code and how to safely edit them.

July 4, 2017

Use RegEx to filter spam from your mail server - part 1

For years now, I have been writing and publishing spam filters for MailWasher Pro, which is a desktop POP3 and IMAP email filtering program. My filters are very effective at flagging or deleting spam, scams and malware links or attachments. That's great if you use MailWasher Pro. But, if you don't use a spam filtering program and are using your own hosted domain for email, which you read in a desktop email client (not browser based Webmail or Gmail), my regular expressions email filters may protect you from spam threats.

First, the term "regular expressions" is usually abbreviated as: "RegEx" - which is how I will refer to them henceforth in this article. They are characters and formatting that can match all manner of words, numbers, HTML codes, and even empty typed spaces and line feeds. While I usually write RegEx codes by hand, I always test them in a program called Regex Match Tracer. If you want to play around in RegEx land, get a copy of Match Tracer to find and fix errors before you upload them.

Even though I am a long time MailWasher Pro registered user and supporter, there are just some types and sources of email that I don't even care to see in MailWasher's Recycle Bin (from which you can restore email that was deleted accidentally or misinterpreted as spam). Some are repeat spam senders, or Chinese or Russian senders who mistakenly think I give a crap about their counterfeit pills or dating scams. Still others are sent from botnets I have already identified and blocked by certain lines in their email headers.

Get it? Got it? Good! Let's move on to some examples of my own RegEx filters that I use on my mail server for my Bluehost web hosting account.

When you want to block any email from staying available for downloading from your hosted mail server, you have to either send it to the bit bucket (aka: Discard, Send to Dev Null), or Fail (with or without a message to the sending domain). In lieu of deleting a message, you have the option of redirecting it to another email account, which can even be a Gmail account. So, when I create a new mail filter under my hosting cpanel, I have to decide what to do with those messages matching my criteria.

Sometimes, the safest thing to do at first is to redirect those filtered messages to an account you created only to receive "junkmail." Then you will know if the filter is working when certain email messages begin appearing in that account, shortly after you created the filter to catch them. After a few days of testing to ensure that only the email you targeted is getting redirected, change the action to delete, or fail.

Let's look at the various fields that are available for "email filtering" on a cpanel of an Apache server hosted web account. I'll use a numbered list to keep the steps in order. I'm using the cpanel options available for shared hosting accounts on Bluehost. Yours may vary if you use a different web hosting company, or run a dedicated server.

  1. Log into your domain's "hosting login" account; not the webmail login.
  2. If you don't automatically arrive at cpanel, click on the link or tab labeled "cpanel."
  3. Mouse down to the "email" section, then click on the icon labeled "Account Filtering."
  4. When the Account Filtering page opens, click on "email filters." This displays a list of all email filters you have created, if any, under the heading: "Email Filters."
  5. At the bottom of the list of filters (if any) find the green button labeled "new filter" and press it.
  6. The following items will appear, starting with the text: "Create email message filters that will apply to all email for your account." Under that you have an "Account" options field which lists all of the domain POP3 email accounts you have created thus far. You must have at least one domain email address to use these filter options.
  7. The top option in the email account list is "All email addresses" - which I always select. Otherwise, you have to pick and choose which email address each filter applies to. When I'm blocking spammers, scammers and hostile messages, I don't dink around with only protecting one mailbox. It will also let the existing filters apply to any new accounts you create down the line.
  8. Next, click on "Filter Name" and type in a meaningful name for your filter. The rule name must be unique, otherwise you will overwrite the previous filter with that same name. For testing purposes, name your first filter "Weight Loss Scams."
  9. Click on the button labeled "Next." This opens up two previously hidden input sections, labeled: "Rules" and "Actions." We'll start by explaining the available rules.
  10. Rules have two flyout (multi-choice) input fields. The first defines the portion of the email message that the rule applies to. The second field determines how the text is matched to trigger your filter. The default first rule is "From" and the default second field is "equals." I have 13 options for the first field (From, etc) and 10 for the second field (equals, etc.). I usually think of these two fields as Section (the part of the email that is being tested) and Criteria (how it is matched or not matched).
  11. For the purpose of creating a test filter, leave "From" selected and type in: (?i)(weight\s?loss|Dr\.\s?Oz) under the From field. Change the second Criteria field to "matches regex" instead of "equals."
  12. Click on the + button on the right side of the first rule to open another set of input selectors. This time, choose "Subject." Make sure that the radio option "OR" is selected for the second rule.
  13. In the Subject input box, type, or copy and paste: (?i)diet|nutrition|weight\s?loss|(burn|lose)\s(fat|pounds|weight) Again, change the second Criteria field to "matches regex" instead of "equals."
  14. Under the "Actions" field, select "Discard message" then click on the "create rule" button.

If all goes well, a few seconds after you pressed create rule, the new rule should appear in the list of email filters, with the label you typed in at the beginning. If you have made all of the option choices I listed, any new email that matches either the From field or Subject terms will be discarded (permanently deleted off the mail server with no record or warning). The terms that are matched by my RegEx are as follows.

  • From contains (case insensitive):
    1. Weight Loss
    2. WeightLoss
    3. Dr.Oz
    4. Dr. Oz
  • The Subject contains (case insensitive):
    1. diet
    2. dietitian
    3. nutrition
    4. nutritionist
    5. weight loss
    6. weightloss
    7. burn fat
    8. burn pounds
    9. burn weight
    10. lose fat
    11. lose pounds
    12. lose weight

Feel free to edit this filter by clicking on the "Edit" button to the right of the filter name. You can either add new terms to existing rules, or click on the last + to add another rule field, type in your additions, then click the "update rule" button. The new conditions will be added and saved.

All of these forged senders and sucker bait subjects are blocked by two fairly short regular expression filters. The expression (?i) tells the RegEx interpreter to ignore letter case, meaning either capital or lowercase letters will match. The \s means a space (or other whitespace character) exists. A \s? means a space may or may not exist. This can be altered to include dashes or other single characters between words by simply substituting .? for \s?. The .? means zero or one of any character, including a space. A question mark in a regex means the item before it may or may not exist. The | symbol is used to separate individual terms to match using OR, as in: A|B|C - meaning A or B or C.
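If you don't have a RegEx testing program handy, the two expressions can be tried out with a few lines of Python. This is a sketch under some assumptions: Python's re module stands in for the mail server's engine, the function name is_discarded is invented, and the sample senders and subjects are made up. It mimics the OR rule by reporting True when either field matches.

```python
import re

# The two expressions from steps 11 and 13 above.
from_filter = re.compile(r"(?i)(weight\s?loss|Dr\.\s?Oz)")
subject_filter = re.compile(
    r"(?i)diet|nutrition|weight\s?loss|(burn|lose)\s(fat|pounds|weight)"
)

def is_discarded(sender: str, subject: str) -> bool:
    """Mimic the OR rule: discard when either field matches its expression."""
    return bool(from_filter.search(sender) or subject_filter.search(subject))

# Hypothetical sample messages, made up for this test.
samples = [
    ("Dr. Oz <promo@example.com>", "Hello"),
    ("friend@example.com", "LOSE WEIGHT fast"),
    ("friend@example.com", "Lunch on Friday?"),
]
for sender, subject in samples:
    print(is_discarded(sender, subject), "|", sender, "|", subject)
```

The first two samples should come back True and the last one False. Adding a few legitimate subjects of your own to the list is a quick way to spot a filter that is too greedy before it discards real mail.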

If you want to delve deeper into regular expressions, you can learn about RegEx here

My next article on RegEx will show you how to block IP addresses and ranges and particular spam domains.

July 2, 2017

Protect your hosted websites from hackers with my .htaccess blocklists

If, like me, your website is on a shared hosting account, you can block unwanted traffic via an IP blocklist in your .htaccess file. This could be from hackers, scammers, spammers or automated probes for unpatched exploitable files.

What is .htaccess?

The file named .htaccess is a normally hidden server configuration file used by Apache web servers. Since most of the shared hosting websites run on this open source Apache software, the .htaccess file lets the webmaster control access to all or parts of the website under his or her personal control. The leading dot in the file name tells the Apache server that it is a special control file and to hide it from standard view. If you use a desktop FTP program to upload files to your website, you will have to find the settings option to show hidden files.

Read detailed information about how to use .htaccess here

Before you read any further, note that when editing or creating a .htaccess file, one incorrect or misplaced character or misspelled word, or even a missing required space, can cause a "Server 500" error that locks everybody out from viewing the website from the Internet, including you! Extreme caution and immediate followup online testing are required when altering a .htaccess file.

One of the important things you need to know when editing a .htaccess file is that personal comments and notes that are not actual commands must be preceded by a # character at the start of every unwrapped line of text, including each time you press Enter to create a new line (or paragraph) of text. You cannot just type in notes without prefixing them with the # character or you will cause a Server 500 lockout error.

Example of a properly formatted .htaccess personal note or comment:
# This is a note to myself. The following directives will block Chinese traffic

You must also learn which spellings and directives (aka commands) are allowed and which are not. A misspelled directive won't just be ignored. It will cause a Server 500 error. Note that some web hosting companies may not allow you to create or alter a .htaccess file without their express permission (in which case, call or email them). Fortunately, those are few and far between. I can tell you with direct knowledge that Bluehost allows individual .htaccess files to be created and edited.

So, assuming you know how to safely edit your .htaccess file, let's delve into how my IP blocklists can help protect your Apache server shared hosting website from online hacks and probes.

I compile and use several IP blocklists to protect my own and some of my friends' websites from unwanted or outright hostile traffic. The very first one I created was the Nigerian Blocklist, which was and still is used to keep Nigerian 419 scammers from signing up for accounts, then attempting to scam members of the world famous Steel Guitar Forum. My ability to display and decipher email headers played a large part in creating that blocklist.

My second and third blocklists were the Russian and the Exploited Servers .htaccess blocklists. They were developed after I began reading my own website access logs and learning that all kinds of badness and log spamming was originating from Russian IP addresses and also from bad web hosts that rented out dedicated servers to shady operators, while turning a blind eye to SpamCop and Spamhaus reports.

The next list I developed is the most visited one yet: the Chinese Blocklist. I'd say you wouldn't believe how many vulnerability probes and hacking attempts come out of Chinese IP space, but if you're reading this you probably already know this. It has gotten to the point that I had to write special .htaccess conditions to detect certain Chinese probes and automatically add them to a list of banned IPs, which I then research for their assigned CIDRs and add to the Chinese Blocklist.

The last blocklist I developed is the LACNIC Blocklist, which deals with South American as well as Mexican and Panamanian IP addresses. This blocklist began after I discovered a huge amount of badness hitting my access logs that came from Panama servers and infected Brazilian ISP customers. All of the IP addresses in this blocklist are registered to the LACNIC, which stands for "Latin America and Caribbean Network Information Centre." As of late, most additions to this blocklist are Brazilian IP addresses.

All of the above .htaccess blocklists are available in two different formats. The original blocklist files are for Apache servers up to version 2.2.x. The others cover the newer versions of Apache, from 2.4 onward. The directives in the original versions are not guaranteed to be compatible with a newer version of Apache, unless the host has included a particular module (mod_access_compat) that bridges the old and new .htaccess directives. Before you attempt to include any of my blocklists into your .htaccess, ask your web host's support department, or log into your cPanel to see what version of Apache your site is running on.

You can see the different directives for older and newer versions of Apache by reading the beginning of each line of IP addresses in the original version of the Chinese Blocklist vs. the newer version, with the file name chinese-blocklist_2_4.html. The older version uses "deny from" whereas the newer version uses "Require not ip."
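To make the difference concrete, here is a rough sketch in Python that writes the same CIDR in both styles. The function names are hypothetical and this is not how the published blocklists are produced; it only illustrates the two line formats named above. Running each entry through ipaddress.ip_network first means a typo in a CIDR raises an error on your own computer instead of on the live server. The sketch prints only the IP lines; any surrounding container directives (on Apache 2.4, "Require not ip" lines need to sit inside a <RequireAll> block alongside a positive rule such as "Require all granted") are left out.

```python
import ipaddress

def legacy_lines(cidrs):
    """Apache 2.2 style: one 'deny from' line per validated CIDR."""
    return [f"deny from {ipaddress.ip_network(c)}" for c in cidrs]

def modern_lines(cidrs):
    """Apache 2.4 style: one 'Require not ip' line per validated CIDR."""
    return [f"Require not ip {ipaddress.ip_network(c)}" for c in cidrs]

# The Chinese /14 range used as the example in the RegEx article.
cidrs = ["47.92.0.0/14"]

print("\n".join(legacy_lines(cidrs)))  # deny from 47.92.0.0/14
print("\n".join(modern_lines(cidrs)))  # Require not ip 47.92.0.0/14
```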

What happens when someone from an IP on a blocklist visits your website?

When a request for a web asset comes in from any IP that is within a blocklisted CIDR (an often large range of hundreds, thousands, or even millions of IPs), the requester will receive a Server 403 Forbidden response. If you have set up a custom 403 page, it will be shown to that visitor. The way my published blocklists are configured, all files under the directory and sub directories in which that .htaccess resides will be forbidden. That means that if you place the blocklist directives inside the main .htaccess in your web root (e.g., public_html), all files and folders will be affected. You are free to edit the <Files> directive to only block certain directories, or file types, or to place the blocklist .htaccess inside a particular directory where it will block just that folder tree.

The addition of an IP blocklist adds a layer of defense against bad bots, script kiddies and automated probes. It cannot stop a determined human hacker who can hide behind a variety of IP proxies, many of which are not on a blocklist (yet). A lot of these probes are for unpatched exploitable CMS software, like Joomla, WordPress, some shopping carts and guestbooks. If you are using any of these PHP driven 1-Click install scripts and are not making sure they are updated (better yet, auto-updated) as soon as vulnerabilities are announced, your website will likely get compromised.

If you find this information useful, please consider making a donation for my efforts.

About the author
Wiz's Blog is written by Bob "Wiz" Feinberg, an experienced freelance computer consultant, troubleshooter and webmaster. Wiz's specialty is in computer and website security. Wizcrafts Computer Services was established in 1996.

This weblog is licensed under a Creative Commons License.
The content on this blog may be reprinted provided you do not modify the content and that you give credit to Wizcrafts and provide a link back to the blog home page, or individual blog articles you wish to reprint. Commercial use, or derivative work requires written permission from the author.