“For Microsoft, any limits imposed on Google might help it improve the fortunes of its struggling search engine, Bing.”
While Nick and Eric’s article is more about Google than Bing, the statement does make us think about Bing as a search engine…
Poor Indexing, Poor Search Results
For the average web searcher, Bing, along with its sister Microsoft search engines Yahoo and MSN Search, provides a poor experience.
Bing and MSN Bots Are Banned
I have banned Bing, Yahoo and MSN search engine spiders from my sites! I’m tired of the constant rule breaking and over-crawling by Bing and MSN search bots.
Bing is a Rule Breaker
Microsoft claims Bing honours robots.txt rules. In my experience that is a blatant lie. Bingbot and msnbot simply ignore robots.txt rules and crawl whatever they want. Some of the specific rules broken include:
- crawling system folders
- crawling image folders (msn-media bot). Image folders and the extensions jpg, png, gif and bmp are disallowed
- crawling RSS feeds. All RSS feeds are disallowed: rss.xml, /feed/, etc.
- crawling comment forms, e.g. DOMAIN/comment/184. The path /comment/ is disallowed in robots.txt
The last straw was today. Two days ago I added the Bing and MSN user agent strings to the disallowed bots in robots.txt across all my sites. This morning I saw that these bots read robots.txt, ignored it totally, and crawled the sites anyway.
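For anyone who wants to sanity-check their own file before blaming the crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content below is hypothetical (my real files are not reproduced here), and the user agent tokens `bingbot` and `msnbot` are assumed to be the ones the bots identify with. It only shows what a compliant parser would conclude from such rules, not what Bing actually does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the two Microsoft bots entirely
# and leaves other crawlers with a single disallowed path.
ROBOTS_TXT = """\
User-agent: bingbot
Disallow: /

User-agent: msnbot
Disallow: /

User-agent: *
Disallow: /comment/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a rule-abiding crawler may fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("bingbot", "msnbot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "/index.html"))
```

If the blocked agents come back as allowed here, the problem is in the file; if they come back as disallowed and still show up in the logs, the problem is the bot.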
Bingbot and MSN bots are Rulebreakers
I’m tired of watching Bingbot and msnbot break the rules and crawl disallowed files, folders and paths. Microsoft says its bots obey robots.txt rules. They don’t. Bingbot and msnbot occasionally read the robots.txt file, then immediately go on to crawl items it specifically disallows. For example:
1st Rule Breaker Example – Comments
Comment paths are disallowed:
- Disallow: /comment/
- Disallow: /*/comment/
- Disallow: /comment/reply/
And the result: Bing crawls these paths anyway.
- 18.104.22.168 /comment/169 2004/07/13 14:24
- 22.214.171.124 /comment/179 2004/07/13 14:25
- 126.96.36.199 /comment/201 2004/07/13 14:26
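The comparison of log entries against the rules can be automated. The sketch below is a rough illustration, not the tool I actually used: it assumes each log line has the four whitespace-separated fields shown in the list (IP, path, date, time), and it places the quoted Disallow rules under a wildcard `User-agent: *` group, which is an assumption. Note that the standard-library parser matches rules as plain path prefixes, so the `/*/comment/` wildcard rule is not expanded here; the plain `/comment/` rule is what catches these entries.

```python
from urllib.robotparser import RobotFileParser

# Disallow rules quoted in the post, under an assumed wildcard group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /comment/
Disallow: /*/comment/
Disallow: /comment/reply/
"""

# Log entries as listed in the post: IP, path, date, time.
LOG_LINES = [
    "18.104.22.168 /comment/169 2004/07/13 14:24",
    "22.214.171.124 /comment/179 2004/07/13 14:25",
    "126.96.36.199 /comment/201 2004/07/13 14:26",
]


def find_violations(robots_txt, log_lines, user_agent="bingbot"):
    """Return (ip, path, timestamp) for each request to a disallowed path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    violations = []
    for line in log_lines:
        ip, path, date, time = line.split()
        if not parser.can_fetch(user_agent, path):
            violations.append((ip, path, f"{date} {time}"))
    return violations


if __name__ == "__main__":
    for ip, path, when in find_violations(ROBOTS_TXT, LOG_LINES):
        print(f"{when}  {ip}  fetched disallowed path {path}")
```

Run against a full access log filtered to the bot's user agent, this kind of check gives a list of every request that a rule-abiding crawler should never have made.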