Blog Archives

WordPress Database Junk

WordPress Database Contains a Lot of Junk

WordPress stores a lot of junk in the database: WordPress news feeds, theme and plugin update notices and release information, and leftover data from removed plugins and themes that don’t clean up after themselves. And that’s before getting to the useful junk, like post revisions and other data that WordPress users actually need.

Storing junk in the database is nothing new to WordPress, as a .org support submission from 4 years ago shows.

I noticed my WordPress database was excessive in size, found it padded with 7,000 lines of WordPress “news.” Why is this stuff in my database, and how do I get it out? wordpress.org/support/topic/database-padded-with-junk-content

Interestingly, no-one bothered to reply to the submission.

Unnecessary Data Stored in Database

Read the rest of this entry

Bing Banned

Bing and MSN Bots Are Banned

I have banned Bing, Yahoo and MSN search engine spiders from my sites! I’m tired of the constant rule breaking and over-crawling by Bing and MSN search bots.

Bing is a Rule Breaker

Microsoft claims Bing honours robots.txt rules. In my experience that is a blatant lie. Bingbot and msnbot simply ignore robots.txt rules and crawl whatever they want. Some of the specific rules broken include:

  • crawling system folders
  • crawling image folders (msn-media bot). Image folders and extensions jpg, png, gif, bmp are disallowed
  • crawling RSS feeds. All RSS feeds are disallowed; rss.xml, /feed/, etc
  • crawling comment forms; DOMAIN/comment/184 – the path /comment/ is disallowed in robots.txt

The last straw was today. Two days ago I added the Bing and MSN user agent strings to the disallowed bots in robots.txt across all my sites; this morning I saw these bots read robots.txt, ignore it totally, and crawl the sites anyway.
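
For anyone wanting to check which requests a well-behaved crawler should have skipped, here is a minimal Python sketch using the standard library’s urllib.robotparser. The robots.txt content, the example.com host and the sample log hits are all made up for illustration (my real files are not reproduced here), and the helper names build_parser and find_violations are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, loosely modelled on the rules described above.
ROBOTS_TXT = """\
User-agent: bingbot
Disallow: /

User-agent: msnbot
Disallow: /

User-agent: *
Disallow: /comment/
Disallow: /feed/
Disallow: /rss.xml
"""


def build_parser(robots_text):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def find_violations(parser, hits, host="https://example.com"):
    """Return the (agent, path) hits that robots.txt does not allow.

    `agent` is the short bot name (e.g. "bingbot"), not the full
    User-Agent header from the access log.
    """
    return [(agent, path) for agent, path in hits
            if not parser.can_fetch(agent, host + path)]


if __name__ == "__main__":
    # Made-up log entries, for illustration only.
    sample_hits = [
        ("bingbot", "/comment/184"),
        ("msnbot", "/feed/"),
        ("Googlebot", "/about"),
        ("Googlebot", "/comment/184"),
    ]
    for agent, path in find_violations(build_parser(ROBOTS_TXT), sample_hits):
        print(f"{agent} should not have fetched {path}")
```

This only shows what the rules say a compliant bot may fetch; proving a bot actually broke them still means pulling the matching requests out of the real server logs.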

Have You Seen Bad Activity by Bing?

Read the rest of this entry

Website Down for Visitor Safety

Website Offline after DoS Attack

My Drupal website, graphicline.co.za, remains offline today following yesterday’s JavaScript injection / denial of service attack. I decided to take the site offline to ensure the safety of visitors while I check the site for any malware. My hosting service technicians are also examining the server for any possible faults or configuration problems. Other sites on sub-domains of graphicline.co.za were affected at times, and further disruptions of service are expected.

The DoS (denial of service) attack began in the early hours of January 24, 2012 and continued for nearly 2 hours. During this time thousands of attempts were made to inject JavaScript redirect code into the website (there are too many related entries in the log to count). Although initial inspection showed no successful hack, I felt it prudent to take the site down until I was certain no malware or other bad stuff had been planted.

Read the rest of this entry

GNAX Hosting – Early Results

GNAX Hosting – So Far So Good

Last week I moved my domain graphicline.co.za to GNAX VPS hosting. I’ve watched Google page load times get shockingly poor over the past four months. Nothing I’ve done on-site to improve performance has made any difference. I’d already tried several caching systems and offloaded some files to a CDN and other fast servers – with no improvement.

Eventually, after trying everything else, the only conclusion I could draw was that the long network path between Google’s Mountain View servers and the data centre hosting my domain was the main bottleneck in the time it took Big G to load pages.

Average page load times for two of the sites (WordPress) on the domain had gone from under 2.5 seconds in May to over 4 seconds in August and over 5 by September, while the main site (Drupal) was approaching 4 seconds, up from under 2 in May. The minimum page load time had reached nearly 4 seconds for one site by September.

Read the rest of this entry

No Loss to Penguin

Google Penguin Had Zero Impact on My Sites

April 24, 2012, and Google hits out against over-optimised sites with its Penguin algorithm. Penguin penalised websites with over-optimised anchor text in incoming links – backlinks, in other words.

The Penguin algorithm slipped past my attention until I read a tweet from Matt Cutts. I’m nearly obsessive about watching traffic stats for the sites in my portfolio; they get checked daily for activity of all types – traffic, attacks, broken links and so on – and I hadn’t seen any unusual traffic dips on April 24 or shortly after. If anything, traffic to these websites has increased since that time.

Read the rest of this entry

Server Resource Overload

My Domain is too Busy – Server Overloaded

I never thought I would be saying this – at least not within 12 months of switching my site(s) to a CMS. It’s been nearly twelve months since I installed Drupal for my main website, and less since I added two WordPress sub-domain sites. Now the shared server is having to push out more than 80 000 pages a month – or so the server stats show. And it’s causing problems!

It’s causing problems because my domain is exhausting the available server resources. The resource that most often exceeds its limit is CPU, which frequently hits 100% utilisation. Memory use often goes over the 1GB share – that’s right, 1000MB – available for the domain. While maximum entry processes are averaging between 4 and 6, there have been several instances where the limit of 20 was exceeded.

And I’ve been wondering why I’ve been battling to edit and publish content for the past 6 weeks.

Server Resources Usage Chart for graphicline.co.za

Read the rest of this entry

New Shop Front for Graphicline Shop

Break to Renew Shop Front Page

The new shop front page finally got started today, after being a low priority for several weeks. I decided to take a break from the tedious work of capturing products to do something about the front page of the website, which was looking tacky after the change in layout width.

First off, a new sliding banner got set up with 1080 pixel wide images, 100 pixels wider than the old slider. The banner is also 80 pixels lower in height. The change of catalogue from WPOnlineStore to CartPress increased the width of the site pages, and the old banner looked very untidy.

Read the rest of this entry

Pansee Site Valuation – Rubbish

False Valuation by Pansee.com

Ever had a site valuation by pansee.com? I got sent a mail informing me ‘someone’ had conducted a valuation of my website graphicline.co.za using pansee.com valuation tools, with a link to the valuation report. Interested to see what the report contained, I checked if Google had any information about malware on the site, then visited the page.

The valuation report had some interesting data, from the country where most of the website traffic comes from to the number of daily visitors, plus a claim about the value of advertising on the front page.

France is the Biggest Source of Traffic

This amused me… According to pansee.com, 12.2 percent of my traffic comes from France, while the USA only accounts for 8.1 percent.

Read the rest of this entry

2753 Spam Comments in Two Weeks

The Heavily Spammed Article

Three spambots tried to leave 2753 spam comments on a single article in two weeks. I’m pleased to say none were successful – all were blocked by Drupal CAPTCHA. The article receiving this unwanted attention, “Backlinks for Results”, is about the use of website backlinks. I would take an educated guess at the subject matter of these spammers’ efforts – Black Hat SEO services!

That adds to the tally of around fifty other spam comments blocked most days of the week… I for one am very thankful for CAPTCHA challenges. These annoying, much hated image and text field challenges save a lot of time, and time is money…

Spambots are an evil of the net today, and there’s no getting away from them. The better a site performs in the Google SERPs and the more visitors it gets, the more spammers, both bots and human, will try to leave backlinks in rubbish comments, hoping for that elusive “followed” backlink or just the traffic from readers’ clicks.

Read the rest of this entry

Apache Down

Our Website Server Has a Problem

That horrible feeling of clicking on a page, only for the site to report a server error and stay down… Since just after midday yesterday the Apache server hosting graphicline.co.za and all our sub sites has experienced problems. Major configuration changes were rolled out starting on Sunday April 16, which have severely disrupted the function of these sites…

Memory Allocation Reset to Default

First of all, memory allocations were reset to the server default value of 32MB – totally inadequate to run Drupal and WordPress. Then today I discovered that sub-domains containing only static HTML files were also throwing up a server error – so the server wasn’t recognising the sub-domains as such.

Read the rest of this entry