Blog Archives
Rulebreaker Bingbot and MSN
Bingbot and MSN bots are Rulebreakers
I’m tired of watching Bingbot and msnbot breaking rules and crawling disallowed files, folders and paths. Microsoft say their bots obey robots.txt rules – They don’t. Bingbot/msnbot faithfully reads robots.txt, then immediately afterwards continues on to crawl items specifically listed…
1st Rule Breaker Example – Comments
Comment paths are disallowed:
- Disallow: /comment/
- Disallow: /*/comment/
- Disallow: /comment/reply/
And the result? Bing crawls these paths anyway:
- 157.56.93.219 /comment/169 2004/07/13 14:24
- 157.56.93.219 /comment/179 2004/07/13 14:25
- 157.56.93.219 /comment/201 2004/07/13 14:26
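For anyone wanting to check their own logs the same way, here is a minimal sketch using Python's standard `urllib.robotparser`. The `User-agent: *` line, the `RULES` text, and the helper name `is_allowed` are my assumptions for illustration; only the Disallow rules and the crawled paths come from the listing above. As far as I know the standard-library parser does plain prefix matching and does not understand the `*` wildcard, so the `/*/comment/` rule is left out of the sketch.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: the Disallow lines are from the post,
# the "User-agent: *" line is an assumption.
RULES = """\
User-agent: *
Disallow: /comment/
Disallow: /comment/reply/
"""


def is_allowed(path, agent="bingbot"):
    """Return True if the rules above permit `agent` to fetch `path`."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(agent, path)


# Paths taken from the access log lines above.
crawled = ["/comment/169", "/comment/179", "/comment/201"]

# Every crawled path that the rules forbid is a violation.
violations = [path for path in crawled if not is_allowed(path)]

for path in violations:
    print("disallowed but crawled:", path)
```

Under these assumptions all three logged requests fall under `Disallow: /comment/`, which is the point of the complaint.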
Auto Hyperlinks
Now we Get Auto Hyperlinks – Bad News
Text gets turned into hyperlinks automatically. I just discovered this annoying thing that’s part of the latest version of WordPress used by WordPress.com – WordPress 3.5. Type the text for a URL and the darn thing turns into a hyperlink when published. That’s right, you don’t have to click on the link function in the editor, so no options to add target info and title… No options not to create the hyperlink… Arrgghhh!
Maybe it’s handy for the terminally lazy, but it’s bad news for SEO. And what about the bloggers who write about malware and bad websites, and want to tell readers about these bad addresses? They don’t want visitors to click a hyperlink, just want to inform people about the bad address. With auto-hyperlinks the information becomes an active link!
For example, this hacker information “Exploit attempt on WordPress GD Star Rating plugin”
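One workaround some security writers use is to "defang" an address before publishing it, so the text no longer looks like a URL to an auto-linker. The sketch below is only an illustration: the function name `defang` is hypothetical, and whether a given WordPress version leaves the rewritten text alone is an assumption you would need to test on your own blog.

```python
import re


def defang(text):
    """Rewrite URL-like text so it is less likely to be auto-linked."""
    # http:// becomes hxxp:// and https:// becomes hxxps://
    text = re.sub(r"\bhttp(s?)://", r"hxxp\1://", text, flags=re.IGNORECASE)
    # A bare "www." prefix often triggers auto-linking too, so break it up.
    text = re.sub(r"\bwww\.", "www[.]", text, flags=re.IGNORECASE)
    return text


print(defang("Do not visit http://bad.example/path or www.bad.example"))
```

Readers can still see the address and type it in if they really want to, but there is nothing to click.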
Hosting Change – Ten Times Faster
Website Loads 10 Times Faster After Hosting Change
One of my sub-sites loads 10 times faster after moving the domain to an offshore server. To be totally fair and put the improvement in perspective, the actual server is not that much faster; the big difference is route latency or lag.

Before the move, the average time it took Googlebot to load a page from this site was around 1100 ms. Now, a month later, we can see the improvement: the average time is about 100 ms.
GNAX Hosting – Early Results
GNAX Hosting – So Far So Good
Last week I moved my domain graphicline.co.za to GNAX VPS hosting. I’ve watched Google page load times get shockingly poor over the past four months. Nothing I’ve done on-site to improve performance has made any difference. I’d already tried several caching systems and offloaded some files to a CDN and other fast servers – with no improvement.
Eventually, after trying everything else, the only conclusion I could draw was that the long path between Google’s Mountain View servers and the data centre hosting my domain was the main bottleneck in the time it took Big G to load pages.
Average page load times for 2 of the sites (WordPress) on the domain had gone from under 2.5 seconds in May to over 4 seconds in August and over 5 by September, while the main site (Drupal) was approaching 4 seconds, up from under 2 in May. Minimum page load time had reached nearly 4 seconds for one site by September.
No Loss to Penguin
Google Penguin Had Zero Impact on My Sites
On April 24, 2012, Google hit out against over-optimised sites with its Penguin algorithm. Penguin penalised websites with over-optimised anchor text in incoming links – backlinks, in other words.
The Penguin algorithm slipped past my attention until I read a tweet from Matt Cutts. I’m nearly obsessive about watching traffic stats for the sites in my portfolio; they get checked daily for activity of all types – traffic, attacks, broken links and so on, and I hadn’t seen any unusual traffic dips on April 24 or shortly after. If anything, traffic to these websites has increased since that time.
Cache Pre-load Impact on Performance
Cache Pre-load Improves Google Page Load
Using a cache pre-load system can improve Google crawl page load speed substantially, as clearly shown in the infographic below. Google considers page load in its SERP algorithm as an indicator of site quality: where two similarly ranked sites exist, the site with the faster load speed will usually get a better SERP position than the slower one. With this in mind, surely it’s a good idea to make the effort to improve page load speed as much as possible.
Page load speed can be improved in a number of ways: moving the site to a better hosting service, optimising the site technically (including getting rid of unnecessary plugins), keeping image sizes as small as possible, and using an effective caching system are some of the things we can do.
No matter how well all the other technical aspects are improved, caching the site, and especially pre-loading the cache, will make a big difference to page load speed.
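To show roughly what pre-loading a cache means in practice, here is a minimal sketch that reads the page addresses from a sitemap and requests each one, so the cached copy already exists before a crawler or visitor arrives. This is not the plugin or module I use; the function names `sitemap_urls`, `http_fetch` and `preload_cache` are hypothetical, and it assumes the site publishes a standard sitemaps.org XML sitemap.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(sitemap_xml):
    """Return every <loc> address found in the sitemap XML text."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def http_fetch(url):
    """Request a page so the server builds and stores its cached copy."""
    with urlopen(url, timeout=10) as response:
        response.read()
        return response.status


def preload_cache(sitemap_xml, fetch=http_fetch):
    """Request each sitemap URL once and record the status or the error."""
    results = {}
    for url in sitemap_urls(sitemap_xml):
        try:
            results[url] = fetch(url)
        except OSError as exc:  # urllib's URLError and HTTPError derive from OSError
            results[url] = repr(exc)
    return results
```

Run something like this on a schedule, shortly after the cache expires, and passing in a different `fetch` function makes it easy to try out without touching the live site.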
Latest Links to Your Site
Google Webmaster Tools Latest Links to Site
Latest Links to your site; something new from Google Webmaster Tools. It’s been a while since Google added anything new to their webmaster tools collection, but this new information tool is a “goody”. Latest Links to your site is a downloadable file, in CSV or Google Docs format, providing the site owner with a list of discovered backlinks, sorted by date and date-stamped.
Find Trackback Spam
The first practical use that comes to mind for Webmaster Tools Latest Links is to check for trackback spam. Trackback spam is a black-hat SEO technique to get you to publish trackbacks. The spammer sends a trackback to a post on your blog, hoping it will be published automatically, or approved if comment approval is required. But the black-hat doesn’t actually publish a link to the post – it’s a scam. And being friendly WordPress.com bloggers, many of us see a trackback and think, “how nice, someone has given me a backlink, let me reciprocate”.
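The same check can be done by hand for a single trackback: look at the page that claims to link to you and see whether the link is really there. A minimal sketch of that idea follows, using Python's standard `html.parser`. The names `LinkCollector` and `links_back` are hypothetical, and the comparison is deliberately simple (exact match apart from a trailing slash), so treat it as a starting point rather than a finished spam filter.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_back(page_html, post_url):
    """Return True if the page really contains a link to post_url."""
    collector = LinkCollector()
    collector.feed(page_html)
    target = post_url.rstrip("/")
    return any(href.rstrip("/") == target for href in collector.hrefs)
```

Feed it the HTML of the page named in the trackback together with the address of your own post. If it returns False, the "backlink" was never there.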
2753 Spam Comments in Two Weeks
The Heavily Spammed Article
Three spambots tried to leave 2753 spam comments on a single article in two weeks. I’m pleased to say none were successful – all were blocked by Drupal CAPTCHA. The article receiving this unwanted attention is about the use of website backlinks, “Backlinks for Results”. I would take an educated guess at the subject matter of these spammers’ efforts – Black Hat SEO services!
That adds to the tally of around fifty other spam comments blocked most days of the week… I for one am very thankful for CAPTCHA challenges. These annoying, much hated image and text field challenges save a lot of time, and time is money…
Spambots are an evil of the net today and there’s no getting away from them. The better a site performs in Google SERP and the more visitors it gets, the more spammers, both bots and human, will try to leave backlinks in rubbish comments, hoping for that elusive “followed” backlink or just the traffic from readers’ clicks.
Fix this Message – Fake Warning
Mar 21
Posted by Mike
Another Spam Scam – Fix this Message
“If you are the owner of the site, you can fix this message by publishing…” is appearing all over blog comment forms. The spammer would have the blogger believe there is an error message somewhere on the site, and that publishing the contents of the comment will somehow fix the supposed problem…
Mysteriously fix the Error Message
Of course this is a spammer trying to get a link to some trash site published, hoping to attract click-throughs, sell some rubbish product like cheap black-market Viagra, install malware on the visitor’s computer, or steal personal information such as banking details. Are we really that naïve? I don’t think so.
Posted in Scam, Spam
Tags: Blog, Scam, Security Risks, SEO, spam, Spam Comments