Blog Archives

Rulebreaker Bingbot and MSN

Bingbot and MSN bots are Rulebreakers

I’m tired of watching Bingbot and msnbot break the rules and crawl disallowed files, folders and paths. Microsoft says its bots obey robots.txt rules. They don’t. Bingbot/msnbot faithfully reads robots.txt, then immediately goes on to crawl items specifically listed…

1st Rule Breaker Example – Comments

Comment paths are disallowed:

  • Disallow: /comment/
  • Disallow: /*/comment/
  • Disallow: /comment/reply/

And the result? Bing crawls these paths anyway:

  • 157.56.93.219   /comment/169   2004/07/13  14:24
  • 157.56.93.219   /comment/179   2004/07/13  14:25
  • 157.56.93.219   /comment/201   2004/07/13  14:26
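The mismatch between the rules and the crawl log can be checked mechanically. Here is a minimal sketch using Python's standard `urllib.robotparser`. The `User-agent: *` line, the `bingbot` agent token and the function name `disallowed_hits` are assumptions for illustration, since the excerpt only lists the Disallow rules. The standard-library parser matches rules by plain prefix and does not interpret the `*` wildcard, so the sample paths are caught by the plain `/comment/` rule.

```python
from urllib.robotparser import RobotFileParser

# Rules from the post; the User-agent line is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /comment/
Disallow: /*/comment/
Disallow: /comment/reply/
"""

# Log entries from the post: IP, path, date, time.
LOG_LINES = [
    "157.56.93.219   /comment/169   2004/07/13  14:24",
    "157.56.93.219   /comment/179   2004/07/13  14:25",
    "157.56.93.219   /comment/201   2004/07/13  14:26",
]


def disallowed_hits(robots_txt, log_lines, user_agent="bingbot"):
    """Return (ip, path, timestamp) for every logged request that robots.txt forbids."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    hits = []
    for line in log_lines:
        ip, path, date, time_of_day = line.split()
        if not parser.can_fetch(user_agent, path):
            hits.append((ip, path, f"{date} {time_of_day}"))
    return hits


for ip, path, timestamp in disallowed_hits(ROBOTS_TXT, LOG_LINES):
    print(f"{timestamp}  {ip} fetched disallowed path {path}")
```

Run against a real access log, a check like this would list every request that a well-behaved crawler should never have made.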


Auto Hyperlinks

Now we Get Auto Hyperlinks – Bad News

Text gets turned into hyperlinks automatically. I just discovered this annoying feature in the latest version of WordPress used by WordPress.com, WordPress 3.5. Type the text for a URL and the darn thing turns into a hyperlink when published. That’s right, you don’t have to click the link function in the editor, so there are no options to add target info and a title… and no option not to create the hyperlink…  Arrgghhh!

Maybe it’s handy for the terminally lazy, but it’s bad news for SEO. And what about the bloggers who write about malware and bad websites, and want to tell readers about these bad addresses? They don’t want visitors to click a hyperlink; they just want to inform people about the bad address. With auto-hyperlinks the information becomes an active link!

For example, this hacker information “Exploit attempt on WordPress GD Star Rating plugin”


Hosting Change – Ten Times Faster

Website Loads 10 Times Faster After Hosting Change

One of my sub-sites loads 10 times faster after moving the domain to an offshore server. To be totally fair and put the improvement in perspective, the actual server is not that much faster; the big difference is route latency or lag.

[Image: graph showing 10 times faster page load speed]

Before the move, the average time it took Googlebot to load a page from this site was around 1100 ms. Now, a month later, we can see the improvement: the average time is about 100 ms.
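As a quick check on the headline figure, the arithmetic can be sketched in a few lines of Python. The function name `speedup` is just for illustration, and both inputs are the rounded averages quoted above, so the result is approximate.

```python
def speedup(before_ms, after_ms):
    """Return (speed-up factor, percentage reduction in load time)."""
    factor = before_ms / after_ms
    reduction_pct = (before_ms - after_ms) / before_ms * 100
    return factor, reduction_pct


factor, reduction = speedup(1100, 100)
print(f"{factor:.0f}x faster, {reduction:.0f}% less time per page")
# prints "11x faster, 91% less time per page"
```

With rounded inputs, "10 times faster" is a fair round-number summary of a reduction of roughly 91%.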


GNAX Hosting – Early Results

GNAX Hosting – So Far So Good

Last week I moved my domain graphicline.co.za to GNAX VPS hosting. I’ve watched Google page load times get shockingly poor over the past four months. Nothing I’ve done on-site to improve performance has made any difference. I’d already tried several caching systems and offloaded some files to a CDN and other fast servers, with no improvement.

Eventually, after trying everything else, the only conclusion I could draw was that the long path between Google’s Mountain View servers and the data centre hosting my domain was the bottleneck, and the main culprit in the time it took Big G to load pages.

Average page load times for two of the sites (WordPress) on the domain had gone from under 2.5 seconds in May to over 4 seconds in August and over 5 by September, while the main site (Drupal) had gone from under 2 seconds in May to nearly 4. The minimum page load time had reached nearly 4 seconds for one site by September.


No Loss to Penguin

Google Penguin Had Zero Impact on My Sites

April 24, 2012: Google hits out against over-optimised sites with its Penguin algorithm. Penguin penalised websites with over-optimised anchor text in incoming links, or backlinks in other words.

The Penguin algorithm slipped past my attention until I read a tweet from Matt Cutts. I’m nearly obsessive about watching traffic stats for the sites in my portfolio; they get checked daily for activity of all types (traffic, attacks, broken links and so on), and I hadn’t seen any unusual traffic dips on April 24 or shortly after. If anything, traffic to these websites has increased since that time.


Fix Widget Position

CSS Fix for Text Widget Without Title

I previously wrote about using a Text Widget without a Widget Title or Name as a way to gain a small SEO improvement for a WordPress.com blog (see WordPress Widget Headings) by using a lower-order heading tag than the standard <h3> mark-up used by the theme.

Some themes play nicely when the title is blank, collapsing the space the title would have used. More themes, however, keep the space intended for the title, leaving a large blank gap that spoils the appearance of the blog. This can be fixed fairly easily by using CSS styles to reposition the widget content.


Cache Pre-load Impact on Performance

Cache Pre-load Improves Google Page Load

Using a cache pre-load system can improve Google crawl page load speed substantially, as clearly shown in the infographic below. Google considers page load speed in its SERP algorithm as an indicator of site quality: where two similarly ranked sites exist, the site with the faster load speed will usually rank better than the slower one. With this in mind, surely it’s a good idea to make the effort to improve page load speed as much as possible.

Page load speed can be improved in a number of ways: moving the site to a better hosting service, optimising the site technically (including getting rid of unnecessary plugins), keeping image sizes as small as possible, and using an effective caching system are some of the things we can do.

No matter how well all the other technical aspects are improved, caching the site, and especially pre-loading the cache, will make a big difference to page load speed.
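To make the idea concrete, here is a minimal sketch of a cache pre-loader in Python. It simply requests each page once so the site's cache is already warm before a crawler arrives. The names `fetch_page` and `preload_cache`, the timeout and the delay are all assumptions for illustration; this is not the pre-load system used on the site.

```python
import time
import urllib.request


def fetch_page(url, timeout=10.0):
    """Request one page in full and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # read the body so the page is fully generated
        return response.status


def preload_cache(urls, fetch=fetch_page, delay=0.0):
    """Visit every URL once and return {url: (status, elapsed_ms)}.

    A status of None means the request failed.
    """
    results = {}
    for url in urls:
        start = time.perf_counter()
        try:
            status = fetch(url)
        except OSError:  # covers URLError, HTTPError and timeouts
            status = None
        elapsed_ms = (time.perf_counter() - start) * 1000
        results[url] = (status, elapsed_ms)
        if delay:
            time.sleep(delay)  # be gentle with the server
    return results
```

In practice the URL list might come from the site's sitemap, for example `preload_cache(["https://example.com/", "https://example.com/about"], delay=0.5)` with your own addresses, run on a schedule shortly after the cache expires.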


Latest Links to Your Site

Google Webmaster Tools Latest Links to Site

Latest Links to your site is something new from Google Webmaster Tools. It’s been a while since Google added anything new to their webmaster tools collection, but this new information tool is a “goody”. Latest Links to your site is a downloadable file, in CSV or Google Docs format, providing the site owner with a list of discovered backlinks, sorted by date and date-stamped.

Find Trackback Spam

The first practical use that comes to mind for Webmaster Tools Latest Links is to check for trackback spam. Trackback spam is a black-hat SEO technique to get you to publish trackbacks. The spammer sends a trackback to a post on your blog, hoping it will be published automatically, or approved if comment approval is required. But the black-hat doesn’t actually publish a link to the post. It’s a scam. And being friendly WordPress.com bloggers, many of us see a trackback and think, “how nice, someone has given me a backlink, let me reciprocate”.
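The check behind this can be sketched in Python: given the HTML of the page that sent the trackback, see whether it really contains a link to your domain. This is only an illustration using the standard library. The names `LinkCollector` and `page_links_to` are hypothetical, and fetching the page and reading the Latest Links CSV are left out because the file's column layout is not described here.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def page_links_to(html, my_domain):
    """Return True if the page has at least one link to my_domain or a subdomain."""
    collector = LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        host = urlparse(href).hostname or ""
        if host == my_domain or host.endswith("." + my_domain):
            return True
    return False
```

A trackback whose source page fails this check, and which never shows up in the Latest Links list either, is a good candidate for the spam bin.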


2753 Spam Comments in Two Weeks

The Heavily Spammed Article

Three spambots tried to leave 2753 spam comments on a single article in two weeks. I’m pleased to say none were successful; all were blocked by Drupal CAPTCHA. The article receiving this unwanted attention is about the use of website backlinks, “Backlinks for Results”. I would take an educated guess at the subject matter of these spammers’ efforts: Black Hat SEO services!

That adds to the tally of around fifty other spam comments blocked most days of the week… I for one am very thankful for CAPTCHA challenges. These annoying, much hated image and text field challenges save a lot of time, and time is money…

Spambots are an evil of the net today and there’s no getting away from them. The better a site performs in Google SERP, and the more visitors it gets, the more spammers, both bots and human, will try to leave backlinks in rubbish comments, hoping for that elusive “followed” backlink or just the traffic from readers’ clicks.


Fix this Message – Fake Warning

Another Spam Scam – Fix this Message

“If you are the owner of the site, you can fix this message by publishing…” is appearing all over blog comment forms. The spammer would have the blogger believe there is an error message somewhere on the site, and that publishing the contents of the comment will somehow fix the supposed problem…

Mysteriously fix the Error Message

Publish the comment and the problem with the site is gone! Wow – as easy as that. No checking code files or testing plugins, all your problems are solved if you are the owner of the site…  Publish the comment and you can fix this message. So simple.

Of course this is a spammer trying to get a link to some trash site published, hoping to attract click-throughs, sell some rubbish product like cheap black-market Viagra, or install malware on the visitor’s computer and steal personal information such as banking details. Are we really that naïve? I don’t think so.
