Access log "Referer" spam still happening through 2011
Takeaway:
I write about a lot of different types of spam, but one of the oldest, next to email and USENET spam, is spamming the "REFERER" field in a website's raw access logs. I have been seeing this form of spam for over a decade now.
What is a raw access log?
Websites are usually set up or configured to generate a text or graphical log of all visits to those sites (a.k.a. "hits"). These logs contain information that is useful to the Webmasters of those sites. Graphical access logs use pie or column charts to show where the hits are coming from, who sent them to you, what details they were searching for and other useful facts about each request. A "raw access log" presents these details in plain text format, one request per line, in space-separated groups.
Why would anybody want to spam a website's raw access logs?
Over a decade ago, spammers learned that some website owners, or free hosting companies, or individuals hosting their own web servers at home (usually against T.O.S) were actually publishing their raw access logs so that the owners could read them in a web browser, from anywhere they might be. Most of these published access logs are not password protected, meaning anybody anywhere can view them, if they know the location of those website log files. Since so many people do not understand website security at all, they leave configurations in a default state. This means that if their raw access logs are published, the folder location will be predictable, based upon the operating system of the web server. That web server is usually the Apache Web Server.
Thus, when spammers began seeing website raw access logs that were in default folder locations, on various web servers, they could read them in their browsers, as could anybody else in the world who reads that language. So, some enterprising S.O.B. came up with the brilliant idea of sending requests for files on those websites with fake "referrer" details included.
What is the referrer field in an Access log?
The referrer field is a section of an access log that tells the owner/maintainer of the website where each visitor came from, just before they came to your website. In other words, who referred them to you. This information is extremely valuable for learning who links to your web pages, who is writing about you, or who has found your site by means of a search engine result.
What do spammers do to referrer fields to turn them into spam?
Instead of revealing the actual page that the visitor (human or machine) was viewing when they decided to come to yours, spammers use special software to put whatever content they wish into the referer field. That content usually takes the form of spammy links containing the names of illicit goods (illicit prescription drugs, counterfeit goods) or services (shady or illegal businesses).
Did I just misspell "referrer" as "referer"?
Nope. When the original HTTP specification was being written in the mid-1990s, its authors accidentally misspelled the word Referrer as Referer, and the misspelling was standardized in RFC 1945 (HTTP/1.0). It has stayed with us to this very day!
Now, on to the rest of the details about Referer spam.
Most raw access logs contain the following details:
- IP address of the visitor
- Date and time of the request
- Method (GET, POST, HEAD, etc.)
- Requested Folder (just "/" means default index page)
- Requested file name and extension
- HTTP version (1.0 or 1.1)
- Server Response Code (200=Okay, 403=Forbidden, 404=Not Found, 500=Oops - I broke it)
- Size of file in bytes
- REFERER (What this is all about.)
- User Agent of the visitor (browser name and version and computer OS, search engine robot details, exploit tool, spambot)
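To make the field list above concrete, here is a minimal Python sketch that splits one log line into named fields. It assumes the log uses Apache's "combined" format, which also carries two extra columns (identity and user name, usually just "-") that the list leaves out. The sample line, the IP address and the domain names are made up for illustration.

```python
import re

# Assumed layout: Apache "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '                                   # IP, identity, user
    r'\[(?P<time>[^\]]+)\] '                                  # date and time
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '   # request line
    r'(?P<status>\d{3}) (?P<size>\d+|-) '                     # response code, bytes
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'           # referer, user agent
)

def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# Made-up sample entry (documentation-reserved IP and example domains).
sample = (
    '203.0.113.7 - - [12/Mar/2011:06:25:24 -0500] '
    '"GET /index.html HTTP/1.1" 200 5120 '
    '"http://spam.example/cheap-pills" '
    '"Mozilla/5.0 (compatible; ExampleBot/1.0)"'
)

entry = parse_line(sample)
print(entry["referer"])  # http://spam.example/cheap-pills
```

Every value comes back as a string, so convert the status and size with int() if you need to do arithmetic on them.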
When spammers post spam links in the faked Referer field as they visit your website, they are hoping against the odds that your hosting company is foolish enough to allow your access logs to be published without any credentials required to view the log. They (spammers) use cheap labor, or "bots," or automated web scripts to post spam links to as many websites as they have listed in their databases, which are sold on underground spam forums. Some spammers actually compile their own lists by searching for published raw access logs on Google, Yahoo, Bing and other search engines. Since those logs are publicly viewable, they are also detectable and index-able by search engine crawlers.
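If you want a rough idea of how much of this junk is landing in your own logs, a simple keyword heuristic is one place to start. The Python sketch below is only that: the keyword list and the function name are my own illustrative choices, not a standard, and a real filter would need tuning to avoid false positives.

```python
from urllib.parse import urlparse

# Hypothetical keyword list; adjust it to the spam you actually see.
SPAM_KEYWORDS = ("pills", "casino", "replica", "loans")

def looks_like_referer_spam(referer, own_host):
    """Heuristic: an off-site referer whose URL contains a spam keyword."""
    if not referer or referer == "-":
        return False  # the visitor sent no referer at all
    host = urlparse(referer).hostname or ""
    if host == own_host:
        return False  # a link from one of your own pages
    lowered = referer.lower()
    return any(word in lowered for word in SPAM_KEYWORDS)
```

Run each referer value from your log through looks_like_referer_spam(), passing your own host name as the second argument, and count the hits.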
Take Action!
If you are a webmaster, or own a website, and your access logs are publicly viewable, without a username and password, learn how to either protect them from the public, or turn off their publication altogether. Spammers may continue to post spam links to your referer field, but nobody will see those links - which is how it should be. Do your part in denying an audience to spammers, no matter what type of spam they try to post.
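One quick way to find out whether your own log is exposed is to request it without logging in and look at the response code. The Python sketch below assumes you already know the URL where your host publishes the log; the URL in the usage comment is a placeholder, not a real default location. A 200 response strongly suggests the log is public, though a host that redirects to a login page can also answer 200, so confirm in a browser.

```python
import urllib.error
import urllib.request

def fetch_status(url, timeout=10):
    """Request the URL with no credentials and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 401, 403 or 404

def log_is_exposed(url, fetch=fetch_status):
    """True if an anonymous request for the log comes back 200 (Okay)."""
    return fetch(url) == 200

# Usage (placeholder URL; substitute your own log location):
# print(log_is_exposed("http://www.example.com/logs/access.log"))
```

The fetch parameter exists only so the check can be tried out with a stand-in function instead of a live request.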
Epilogue:
Whether spam is sent by email, or posted to Facebook, Twitter, or a blog, or an access log, it is still pure garbage. Most of it promotes dangerous illicit prescription drugs that are made in India and other countries in Asia, where the quality and content controls are lax, compared to those in the US and Canada and most other Western nations. Some log spam promotes counterfeit goods, pirated software, porn sites, online casinos, underground forums and ripoff sites hawking loans. Don't let your access logs assist spammers in their criminal pursuits!
The content on this blog may be reprinted provided you do not modify the content and that you give credit to Wizcrafts and provide a link back to the blog home page, or individual blog articles you wish to reprint. Commercial use, or derivative work requires written permission from the author.