Blog Archives

Ban Web Server Traffic

Web Server Traffic Should be Banned

Opinions will differ about putting a ban on web server traffic. There are those who want their blogs and websites free from malicious activity, safe and secure for genuine valuable visitors. Then there are those who think there should be no restrictions on web traffic and activity (some even think spam is not bad).

Let’s clarify the web traffic we’re talking about. We’re not talking about banning referrer traffic, i.e. traffic from good backlinks on other websites that brings genuine visitors.

Read the rest of this entry

Panopta.com Pest Bot

Checks.Panopta.com – nuisance bot

Panopta.com calls itself “Uptime Management Software for Hosting Providers, SaaS Providers, IT Managers, and Website Owners”.  Well, there’s nothing wrong with the idea. If your website is critical to your business it’s not a bad thing to get alerted if or when your site is offline.

But Did You Subscribe to Panopta.com?

The Panopta.com monitoring service allows other people to monitor your website! That’s right, your domain can be monitored without you ever signing up for their service. This means a business competitor can monitor your website’s status without your permission!

Read the rest of this entry

Bork-Edition User Agent

Opera User Agent “Bork-Edition”

Have you seen Bork-edition user agent strings? Wondered what browser uses this string? Maybe you have noticed that nearly all traffic to your site with Bork-edition in the user agent string is spam and hacking attempts. At least one writer ranks user agents with Bork-edition among the top 10 spam bots that must be blocked.

There are several user agents which at first glance look harmless, e.g. the user agent string Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 5.0) Opera 7.02 Bork-edition [en]
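A minimal sketch of how such strings could be flagged, assuming a plain case-insensitive substring match on "bork-edition" is enough. The function name is hypothetical and this is not a complete bot filter:

```python
def is_bork_edition(user_agent: str) -> bool:
    """Return True if the user agent string contains the Bork-edition marker."""
    # Lower-case first so "Bork-edition", "bork-edition" etc. all match.
    return "bork-edition" in user_agent.lower()


# The example user agent string quoted above.
sample = ("Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 5.0) "
          "Opera 7.02 Bork-edition [en]")
print(is_bork_edition(sample))
```

A check like this would typically feed a deny rule in the web server or firewall rather than run on every page.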

Read the rest of this entry

Bing Banned

Bing and MSN Bots Are Banned

I have banned Bing, Yahoo and MSN search engine spiders from my sites! I’m tired of the constant rule breaking and over-crawling by Bing and MSN search bots.

Bing is a Rule Breaker

Microsoft claims Bing honours robots.txt rules. In my experience that is a blatant lie. Bingbot and msnbot simply ignore robots.txt rules and crawl whatever they want. Some of the specific rules broken include:

  • crawling system folders
  • crawling image folders (msn-media bot). Image folders and the extensions jpg, png, gif and bmp are disallowed
  • crawling RSS feeds. All RSS feeds are disallowed: rss.xml, /feed/, etc.
  • crawling comment forms; DOMAIN/comment/184 – the path /comment/ is disallowed in robots.txt

The last straw was today. Two days ago I added the Bing and MSN user agent strings to the disallowed bots in robots.txt across all my sites; this morning I saw these bots read robots.txt, ignore it totally, and crawl the sites anyway.

Have You Seen Bad Activity by Bing?

Read the rest of this entry

Rulebreaker Bingbot and MSN

Bingbot and MSN bots are Rulebreakers

I’m tired of watching Bingbot and msnbot break the rules and crawl disallowed files, folders and paths. Microsoft says its bots obey robots.txt rules. They don’t. Bingbot/msnbot occasionally read the robots.txt file, then immediately afterwards go on to crawl items specifically disallowed in it, e.g.:

1st Rule Breaker Example – Comments

Comment paths are disallowed:

  • Disallow: /comment/
  • Disallow: /*/comment/
  • Disallow: /comment/reply/

And the result: Bing crawls these paths anyway.

  • 157.56.93.219   /comment/169   2004/07/13  14:24
  • 157.56.93.219   /comment/179   2004/07/13  14:25
  • 157.56.93.219   /comment/201   2004/07/13  14:26
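The comparison between the Disallow rules and the log entries can be sketched in Python with the standard library's urllib.robotparser. This is only an illustration under assumptions: the rules are assumed to sit under a catch-all "User-agent: *" group (the excerpt does not show the User-agent line), the agent token "bingbot" and the function name find_violations are hypothetical, and the standard library parser treats the "*" in /*/comment/ as a literal character rather than a wildcard.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the Disallow rules live under a catch-all group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /comment/
Disallow: /*/comment/
Disallow: /comment/reply/
"""

# Log entries as listed above: IP, path, date, time.
LOG_LINES = [
    "157.56.93.219   /comment/169   2004/07/13  14:24",
    "157.56.93.219   /comment/179   2004/07/13  14:25",
    "157.56.93.219   /comment/201   2004/07/13  14:26",
]


def find_violations(robots_txt, log_lines, agent="bingbot"):
    """Return (ip, path, timestamp) for every request to a disallowed path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    violations = []
    for line in log_lines:
        ip, path, date, time = line.split()
        # can_fetch() is False when a Disallow rule matches the path prefix.
        if not parser.can_fetch(agent, path):
            violations.append((ip, path, f"{date} {time}"))
    return violations


for ip, path, when in find_violations(ROBOTS_TXT, LOG_LINES):
    print(f"{when}  {ip}  crawled disallowed path {path}")
```

Real access logs have a different layout, so the line.split() step would need adapting to the server's log format.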

Read the rest of this entry

FreeWebMonitoring SiteChecker/0.1

Hacker Bot FreeWebMonitoring SiteChecker/0.1 Pays a Visit

Bad bot “FreeWebMonitoring SiteChecker/0.1 (+http://www.freewebmonitoring.com)” paid a visit to one of my websites yesterday from IP address 184.107.201.242, which belongs to the Canadian service provider Canada Montreal Thst Golf Inc.

The full range of IP addresses owned by Canada Montreal Thst Golf Inc. is 184.107.0.0 – 184.107.255.255
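For anyone who wants to match against this range in a script, here is a short sketch using Python's standard ipaddress module. It collapses the start and end addresses into CIDR form (this range is a single /16 block) and tests whether an address falls inside it. The helper name in_range is hypothetical:

```python
import ipaddress

first = ipaddress.IPv4Address("184.107.0.0")
last = ipaddress.IPv4Address("184.107.255.255")

# Collapse the start-end range into the smallest list of CIDR networks.
networks = list(ipaddress.summarize_address_range(first, last))
print(networks)  # [IPv4Network('184.107.0.0/16')]


def in_range(ip: str) -> bool:
    """Return True if the given address lies inside the provider's range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


# The address the bot visited from.
print(in_range("184.107.201.242"))
```

Blocking a whole /16 also blocks every other customer of that provider, so it is worth weighing before applying it.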

This bot is not the bot used by freewebmonitoring.com. Their bot is “FreeWebMonitoring SiteChecker/0.2 (+http://www.freewebmonitoring.com/bot.html)”

Read the rest of this entry