Quality guidelines – Part 2

Andreas Voniatis, 2012-11-20

Avoid the following techniques:

Here we go!

Automatically generated content

This is where automated software goes trawling the net for content based on the keywords you load into it, jumbles it all up and then publishes it on web pages created on the fly. Wait a minute, I’m sure this technique is used by a white hat machine learning program professing to make your website more semantic! On the less flippant side, this technique usually involves a 302 redirect to the affiliate site, where the URL usually includes an affiliate tracking parameter.
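
For illustration, the redirect part of that pattern boils down to something like the following sketch – the merchant domain and the aff_id parameter are made up, but the shape is the same:

```python
# A minimal sketch of the 302-to-affiliate pattern described above.
# The merchant domain and "aff_id" tracking parameter are hypothetical.
from flask import Flask, redirect

app = Flask(__name__)

AFFILIATE_ID = "12345"  # hypothetical tracking ID

@app.route("/<path:page>")
def doorway(page):
    # Whatever auto-generated page was requested, the visitor is
    # bounced straight to the merchant with the affiliate ID attached.
    target = f"https://merchant.example.com/?aff_id={AFFILIATE_ID}"
    return redirect(target, code=302)

if __name__ == "__main__":
    app.run()
```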

Participating in link schemes

This would be public link networks (formerly known as link farms) where you upload your spintax content and loads of spun content then gets published on loads of websites stupid enough to publish it. Usually these schemes run on a cooperative basis where you must have websites of your own in order to trade inventory and get your links placed. Some of these networks would let you buy credits, others not. The value of the credits may also depend on the toolbar PageRank value or Domain Authority (DA) of the inventory. Well, there’s no honour amongst SEOs as there is amongst thieves, so with PageRank and money entering the link trading equation, people would often fake the value of their inventory by using iframes or sticking all their sites on the same server to game the link network system. Regardless, the usual method for generating inventory would be scraper sites using auto-generated content scripts stealing content from the net. The more sophisticated sites had real content that is or was user generated. However, this is less likely to be in vogue now that search engines place less emphasis, if any, on footer and other site-wide links – given that they are likely to be paid for in cash or in kind.
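
As an aside, spotting inventory stacked on the same server is fairly trivial – a rough sketch (the domains are placeholders) would just resolve each domain and group by IP:

```python
# Rough sketch of spotting link-network inventory hosted on one server:
# resolve each domain to an IP address and group domains that share one.
# The domain names below are placeholders.
import socket
from collections import defaultdict

domains = ["example-blog-1.com", "example-blog-2.com", "example-blog-3.com"]

by_ip = defaultdict(list)
for domain in domains:
    try:
        ip = socket.gethostbyname(domain)
        by_ip[ip].append(domain)
    except socket.gaierror:
        continue  # domain did not resolve

for ip, hosts in by_ip.items():
    if len(hosts) > 1:
        print(f"{ip}: {len(hosts)} domains share this server -> {hosts}")
```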

Cloaking

Cloaking can come in many forms. At its most sophisticated and extreme, IP addresses are harvested from a network of sites and analysed for behaviour to help determine which of them belong to search engines, which to other robots (say from price comparison sites, web services, SEO tool vendors or ecommerce trading platforms like Amazon/eBay) and which to real people. This information would be used to compile a database of spiders, so that when a spider comes along it sees a plain-text HTML site with content (known as spider food) and thinks nothing more of it other than it being worthy of inclusion. Human visitors, on the other hand, are redirected, usually via a 302, to the merchant site via an affiliate link. Personally I think the technology is incredible when you consider how much has gone into it and the advanced features involved. However, it has no place for brands wanting to increase their reputation.

The other form of cloaking may be via user agent; however, that is fraught with problems for deception, as new IP addresses for robots (not limited to search engines) get released all the time and robots don’t always declare their user agents, so this would be a highly risky strategy. You could also cloak by browser agent. But ask yourself – is it really all worth it?
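
Just to illustrate the mechanism (and certainly not to recommend it), the decision a cloaking script makes boils down to something like this sketch – the IP prefixes and user agent tokens are examples, not a real spider database:

```python
# Illustrative only: the core decision a cloaking script makes.
# The IP prefixes and user-agent tokens below are examples, not a real
# or complete spider database.
KNOWN_SPIDER_IP_PREFIXES = ("66.249.",)          # e.g. Googlebot ranges
KNOWN_SPIDER_UA_TOKENS = ("googlebot", "bingbot")

def is_spider(ip: str, user_agent: str) -> bool:
    ua = user_agent.lower()
    return ip.startswith(KNOWN_SPIDER_IP_PREFIXES) or any(
        token in ua for token in KNOWN_SPIDER_UA_TOKENS
    )

def handle_request(ip: str, user_agent: str) -> str:
    if is_spider(ip, user_agent):
        return "200: plain-text HTML spider food"
    # Humans get bounced to the merchant with an affiliate parameter.
    return "302: https://merchant.example.com/?aff_id=12345"

print(handle_request("66.249.66.1", "Googlebot/2.1"))  # spider food
print(handle_request("81.2.69.160", "Mozilla/5.0"))    # sneaky 302
```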

Sneaky redirects

A sneaky redirect could be a page where, after a few seconds, a piece of JavaScript redirects you to a usually completely unrelated page trying to sell you something. More often than not this is likely to be an affiliate program with the tracking ID in the destination redirect URL.
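
If you want to sanity-check a page for this, a crude sketch is to fetch the HTML and pattern-match for a timed JavaScript redirect – proper detection would need a headless browser:

```python
# A crude check for the timed JavaScript redirect described above.
# Real detection would need a headless browser; this just pattern-matches
# the most naive version of the trick.
import re
import urllib.request

def has_timed_js_redirect(url: str) -> bool:
    with urllib.request.urlopen(url) as response:
        html = response.read().decode("utf-8", errors="ignore")
    # setTimeout(...) near a window.location assignment is the telltale.
    pattern = re.compile(
        r"setTimeout.{0,200}?window\.location", re.IGNORECASE | re.DOTALL
    )
    return bool(pattern.search(html))

# print(has_timed_js_redirect("https://example.com/landing-page"))
```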

Hidden text or links

SEOs for some reason want to hide their links from users but not from search engines – not exactly a metric of trust, is it? Other tricks in days gone by were hiding links way below the footer. In defence of SEO, this was used to maintain the user flow of the site whilst sculpting PageRank around the website as appropriate and assigning keyword values to the corresponding pages. No need – search engines are now much better at understanding the content on pages. A method for detection via algorithms could be the distance between the hexadecimal values that denote the colour of the font and of the background.
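
Here’s a sketch of that colour-distance idea: parse the font and background hex values into RGB and measure how far apart they are – a distance near zero means the text is effectively invisible:

```python
# Sketch of the colour-distance detection idea: parse the font and
# background hex colours into RGB and measure how far apart they are.
# A distance near zero means the text is effectively invisible.
import math

def hex_to_rgb(hex_colour: str) -> tuple:
    h = hex_colour.lstrip("#")
    return tuple(int(h[i : i + 2], 16) for i in (0, 2, 4))

def colour_distance(font: str, background: str) -> float:
    f, b = hex_to_rgb(font), hex_to_rgb(background)
    return math.sqrt(sum((fc - bc) ** 2 for fc, bc in zip(f, b)))

# White text on a white background: distance 0, i.e. hidden text.
print(colour_distance("#ffffff", "#ffffff"))  # 0.0
# Black text on white: distance ~441.7, clearly visible.
print(colour_distance("#000000", "#ffffff"))
```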

Doorway pages

We’ve all seen them but we may not know the name for them. I’ll break the suspense: I’m referring to sites that have search results pages as landing pages or, more commonly, dedicated landing pages for different regions or towns. You’ll often see this in many industries, including double glazing, where a website will have pages linked to from the home page as follows:

conservatories london | conservatories surrey | conservatories hampshire | conservatories manchester etc etc

You get the idea. The content will barely make sense and is clearly designed to rank for local searches on a company’s business offering even though they have no local offices or presence in that area. The more sophisticated, though still obvious, tactic is to write some blurb about the local area and then lead on to what it is you’re really selling. Again, very amateur, and there are much better ways to deal with it if you care to get in touch.
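
The reason these pages are so easy to spot is that they are stamped out from a template – something like this sketch (the towns and copy are illustrative):

```python
# Sketch of how doorway pages get stamped out from a template:
# one boilerplate paragraph, dozens of near-identical pages.
# The towns and copy are illustrative.
TOWNS = ["london", "surrey", "hampshire", "manchester"]

TEMPLATE = (
    "<h1>Conservatories {town}</h1>"
    "<p>Looking for conservatories in {town}? "
    "We supply and fit conservatories across {town}.</p>"
)

for town in TOWNS:
    page = TEMPLATE.format(town=town.title())
    filename = f"conservatories-{town}.html"
    print(filename, "->", page[:60], "...")
```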

Scraped content

At its most basic, this is a script that goes trawling the net using keyword inputs to find the content you want but don’t want to pay for. The script will then mix it up using a number of techniques to make it unique, by substituting words, injecting symbols or using a Markov chain. It was common practice for affiliates to steal content in this manner and publish it on pages to create large websites at little cost. Nowadays, content curation is all the rage for automated scalable content spam, which is a problem Google will have to deal with in 2013 or sooner.
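
Since the Markov chain is the workhorse of these scripts, here’s a bare-bones sketch of word-level Markov text generation – the sample text is a placeholder:

```python
# Bare-bones sketch of the word-level Markov chain trick used to
# "uniquify" scraped text. The sample text is a placeholder.
import random
from collections import defaultdict

def build_chain(text: str) -> dict:
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def spin(chain: dict, start: str, length: int = 20) -> str:
    word, output = start, [start]
    for _ in range(length):
        followers = chain.get(word)
        if not followers:
            break
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

source = "the quick brown fox jumps over the lazy dog and the quick dog sleeps"
chain = build_chain(source)
print(spin(chain, "the"))  # plausible-looking but scrambled output
```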

Participating in affiliate programs without adding sufficient value

Essentially, creating a site with content no one cares about, even if it is unique. The likelihood is the content is written with very little authority or original research to make it stand out from the thousands of affiliate sites that get promoted and launched each day. Although launching a meaningless site wouldn’t necessarily incur a penalty, it certainly wouldn’t win any friends either. You’d be far better off thinking of and executing on a unique angle nobody else has come up with that would generate interest.

Loading pages with irrelevant keywords

Usually this is the result of using a content scraper script to create the illusion that the site’s pages have readable content. Still, Panda has put paid to pages filled with keywords but no real added value.

Creating pages with malicious behavior, such as phishing or installing viruses, trojans, or other badware

It makes sense for search engines to dump malicious sites like a hot potato, even if the intent is non-malicious. What do I mean? Firm up your internet server security so that your website doesn’t get hacked and unleash hell on your visitors.

Abusing rich snippets markup

A new one since late 2012 – this is where people stick all kinds of rich snippets using Schema.org markup to score points that are not warranted. An example might be UGC ratings for web pages when it’s impossible for users to submit ratings.
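
For illustration, this is roughly what unwarranted markup looks like – an AggregateRating claiming user reviews on a page that accepts none. It’s sketched here as JSON-LD for readability (at the time microdata was the more common format), and all the values are invented:

```python
# Roughly what abusive rich snippet markup looks like: an
# AggregateRating block claiming user ratings on a page that
# accepts none. All values here are invented.
import json

fake_rating = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Some Product",
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.9",   # claimed average rating
        "reviewCount": "2417",  # claimed number of reviews
    },
}

# Embedded in the page as <script type="application/ld+json">...</script>
print(json.dumps(fake_rating, indent=2))
```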

Sending automated queries to Google

If you’re running auto-posting programs to influence Google Suggest, this could be what the guideline is alluding to, and it is one of the tools used by some in search reputation management.
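
For context, this is the kind of query such tools fire off in bulk – this sketch uses the unofficial suggest endpoint, which is undocumented and may change or be blocked at any time:

```python
# The kind of query such tools automate at scale. This uses the
# unofficial suggest endpoint, which is undocumented and may change
# or be rate-limited/blocked at any time.
import json
import urllib.parse
import urllib.request

def google_suggest(term: str) -> list:
    url = (
        "https://suggestqueries.google.com/complete/search?client=firefox&q="
        + urllib.parse.quote(term)
    )
    with urllib.request.urlopen(url) as response:
        payload = json.loads(response.read().decode("utf-8", errors="replace"))
    return payload[1]  # response shape: [query, [suggestions, ...]]

print(google_suggest("quality guidelines"))
```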

Engage in good practices like the following:

At last some positive guidelines…

Monitoring your site for hacking and removing hacked content as soon as it appears

As mentioned above. You can even 404 URLs if the site has received malicious links.
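
A minimal sketch of that, assuming a Flask-based site – the hacked paths are placeholders:

```python
# Minimal sketch of 404ing URLs that attracted malicious links,
# assuming a Flask site. The paths below are placeholders.
from flask import Flask, abort, request

app = Flask(__name__)

HACKED_PATHS = {"/cheap-pills", "/casino-bonus"}  # placeholder paths

@app.before_request
def kill_hacked_urls():
    # Any URL that only exists because of the hack returns a 404.
    if request.path in HACKED_PATHS:
        abort(404)  # 410 Gone is an even stronger signal

@app.route("/")
def home():
    return "Clean homepage"

if __name__ == "__main__":
    app.run()
```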

Preventing and removing user-generated spam on your site

Clean up your comments and actually moderate any blogs and forums you operate on your site.
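
If you’re moderating at any volume, even a crude triage filter helps – a naive sketch (the spam terms and link threshold are invented; real moderation needs proper tools or human review):

```python
# A naive triage filter for user-generated spam. The spam terms and
# the link threshold are invented for illustration; real moderation
# needs a proper tool or human review.
import re

SPAM_TERMS = {"viagra", "casino", "payday loan"}
MAX_LINKS = 2

def looks_like_spam(comment: str) -> bool:
    text = comment.lower()
    link_count = len(re.findall(r"https?://", text))
    has_spam_term = any(term in text for term in SPAM_TERMS)
    return link_count > MAX_LINKS or has_spam_term

print(looks_like_spam("Great post, thanks!"))                   # False
print(looks_like_spam("cheap viagra http://spam.example.com"))  # True
```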

If you determine that your site doesn’t meet these guidelines, you can modify your site so that it does and then submit your site for reconsideration.

That means putting together a serious document detailing what it is you did, what you have done in good faith to put it right and how you intend to go about doing SEO going forward.
