Bingbot and MSN bots are Rulebreakers
I’m tired of watching Bingbot and msnbot break the rules and crawl disallowed files, folders and paths. Microsoft says its bots obey robots.txt rules. They don’t. Bingbot/msnbot faithfully reads robots.txt, then immediately goes on to crawl items specifically listed…
1st Rule Breaker Example – Comments
Comment paths are disallowed:
- Disallow: /comment/
- Disallow: /*/comment/
- Disallow: /comment/reply/
And the result: Bing crawls these paths anyway.
- 188.8.131.52 /comment/169 2004/07/13 14:24
- 184.108.40.206 /comment/179 2004/07/13 14:25
- 220.127.116.11 /comment/201 2004/07/13 14:26
Now we Get Auto Hyperlinks – Bad News
Text gets turned into hyperlinks automatically. I just discovered this annoying thing that’s part of the latest version of WordPress used by WordPress.com – WordPress 3.5. Type the text for a URL and the darn thing turns into a hyperlink when published. That’s right, you don’t have to click on the link function in the editor, so no options to add target info and title… No options not to create the hyperlink… Arrgghhh!
Maybe it’s handy for the terminally lazy, but it’s bad news for SEO. And what about the bloggers who write about malware and bad websites, and want to tell readers about these bad addresses? They don’t want visitors to click a hyperlink; they just want to inform people about the bad address. With auto-hyperlinks the information becomes an active link!
For example, this hacker information “Exploit attempt on WordPress GD Star Rating plugin”
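One common workaround, which the post itself does not mention, is to "defang" a bad address before publishing it, so the text should no longer look like a URL to an auto-linker. This is a minimal sketch; the function name `defang` and the sample address are made up for illustration, and whether a particular editor still links the result would need checking.

```python
import re


def defang(url: str) -> str:
    """Rewrite a URL so it reads as plain text rather than a link."""
    # Replace the leading scheme "http"/"https" with "hxxp"/"hxxps".
    url = re.sub(r"^http", "hxxp", url, flags=re.IGNORECASE)
    # Bracket every dot so the host name is not recognised as a domain.
    return url.replace(".", "[.]")


print(defang("http://bad.example/path"))
```

Readers can still see the address and reconstruct it by hand, but there is nothing to click.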
Website Loads 10 Times Faster After Hosting Change
One of my sub-sites loads 10 times faster after moving the domain to an offshore server. To be totally fair and put the improvement in perspective, the actual server is not that much faster; the big difference is route latency or lag.
Before the move, the average time it took Googlebot to load a page from this site was around 1100 ms. Now, a month later, we can see the improvement: the average time is about 100 ms.
GNAX Hosting – So Far So Good
Last week I moved my domain graphicline.co.za to GNAX VPS hosting. I’ve watched Google page load times get shockingly poor over the past four months. Nothing I’ve done on-site to improve performance has made any difference. I’d already tried several caching systems and offloaded some files to a CDN and other fast servers, with no improvement.
Eventually, after trying everything else, the only conclusion I could draw was that the long-path bottleneck between Google’s Mountain View servers and the data centre servers hosting my domain was the main culprit in the time it took for Big G to load pages.
Average page loads for 2 of the sites (WordPress) on the domain had gone from under 2.5 seconds in May to over 4 seconds in August and over 5 by September, while the main site (Drupal) was approaching 4 seconds from under 2 in May. Minimum page load speed had got to nearly 4 seconds for one site by September.
CSS Fix for Text Widget Without Title
I previously wrote about using a Text Widget without a Widget Title or Name as a way to gain a small improvement in SEO for a WordPress.com blog (see WordPress Widget Headings) by using a lower-order heading tag than the <h3> standard mark-up used by the theme.
Some themes play nicely when the title is blank, collapsing the space the title would have used. More themes, however, keep the space intended for the title, resulting in a large blank area that spoils the appearance of the blog. This can be fixed fairly easily using CSS styles to reposition the widget content.
Cache Pre-load Improves Google Page Load
Using a cache pre-load system can improve Google crawl page load speed substantially, as clearly shown in the infographic below. Google considers page load in its SERP algorithm as an indicator of site quality: where two similarly ranked sites exist, the site with the faster load speed will usually get a better SERP position than the slower one. With this in mind, surely it’s a good idea to make the effort to improve page load speed as much as possible.
Page load speed can be improved in a number of ways: moving the site to a better hosting service, optimising the site technically (including getting rid of unnecessary plugins), keeping image sizes as small as possible, and using an effective caching system are some of the things we can do.
No matter how well all the other technical aspects are improved, caching the site, and especially pre-loading the cache, will make a big difference to page load speed.
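The idea behind pre-loading is simply to request every page before a visitor or crawler does, so the cache already holds a copy. The sketch below is one generic way to do that in Python and is not the caching system used on this site. The names `fetch_page` and `preload_cache` are made up for illustration, and the URL list would come from your own sitemap.

```python
import time
import urllib.request


def fetch_page(url):
    """Request a page and read the body so the server builds and caches it."""
    with urllib.request.urlopen(url, timeout=10) as response:
        response.read()
        return response.status


def preload_cache(urls, fetch=fetch_page):
    """Fetch each URL once; return load time in ms per URL (None on failure)."""
    timings = {}
    for url in urls:
        started = time.perf_counter()
        try:
            fetch(url)
        except OSError:  # urllib's URLError and HTTPError derive from OSError
            timings[url] = None
            continue
        timings[url] = (time.perf_counter() - started) * 1000.0
    return timings
```

Running `preload_cache` over the site's URLs after each cache flush, for example from a scheduled job, means the first real request for a page is served from the cache rather than built from scratch. The `fetch` parameter exists only so the function can be tried out without touching the network.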
Another Spam Scam – Fix this Message
“If you are the owner of the site, you can fix this message by publishing…” is appearing all over blog comment forms. The spammer would have the blogger believe there is an error message somewhere on the site, and that publishing the contents of the comment will somehow fix the supposed problem…
Mysteriously fix the Error Message
Publish the comment and the problem with the site is gone! Wow – as easy as that. No checking code files or testing plugins, all your problems are solved if you are the owner of the site… Publish the comment and you can fix this message. So simple.
Of course this is a spammer trying to get a link to some trash site published, hoping to attract click-throughs in order to sell some rubbish product like cheap black-market Viagra, install malware on the visitor’s computer, or steal personal information such as banking details. Are we really that naïve? I don’t think so.