Inside Machine Learning for SEO
Every day, hopeful search engine optimisation (SEO) experts attempt the seemingly impossible: reverse-engineering the Google algorithm to deliver those ever-important higher search rankings. Machine learning offers an extra push in the form of a structured, analytical tool that gives greater SEO insight and a better awareness of the underlying search engine algorithms. As with many such tools, however, there is always a weak point. Here, we look at the strengths and weaknesses of Machine Learning Systems (MLS) in SEO.
Machine learning (ML) refers to software algorithms that automate the learning process behind something normally judged or decided by humans. ML allows learning and decision-making processes to be scaled up, making it possible to analyse vast data sets with hundreds of underlying variables. To do this, large data sets must be collected and analysed statistically. The methods used can be broadly separated into two kinds: regression and classification.
Regression relates to the prediction, or forecasting, of numeric outcomes: a learning algorithm is run on a set of gathered training data and produces a hypothesis as its output. A typical use might be predicting how the search engine ranking of a site is likely to be affected by including more or less of certain web site content.
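To make the idea concrete, here is a minimal sketch of regression using ordinary least squares with a single input variable. The data (word counts and ranking positions) and the function names `fit_line` and `predict` are invented for illustration; real ranking data would be far noisier and involve many more variables.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted hypothesis to a new input value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: words of on-topic content per page and the
# ranking position observed for that page (lower position = better ranking).
word_counts = [300, 600, 900, 1200, 1500]
positions = [18, 15, 12, 9, 6]

model = fit_line(word_counts, positions)
predicted_position = predict(model, 1000)  # forecast for a 1000-word page
```

The fitted line is the "hypothesis" described above: once trained, it can be asked about content levels that were never observed directly.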
Classification, on the other hand, relates to the assignment of elements within a data set to two or more defined types. For example, the demographic of a web site user might be classified as a young, up-and-coming hipster or a tech-savvy grandmother.
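A minimal sketch of classification follows, using a nearest-centroid rule: each type is summarised by the average of its labelled examples, and a new visitor is assigned to the closest average. The two visitor segments, the features (session minutes and pages per visit) and all numbers are hypothetical stand-ins, not real demographic data.

```python
import math

# Hypothetical labelled visitors: (average session minutes, pages per visit).
training = {
    "casual": [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5)],
    "engaged": [(8.0, 6.0), (9.0, 7.0), (10.0, 8.0)],
}


def centroid(points):
    """Average position of a group of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def train(labelled):
    """Summarise each labelled type by its centroid."""
    return {label: centroid(points) for label, points in labelled.items()}


def classify(centroids, point):
    """Assign the point to the type whose centroid is nearest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], point))


centroids = train(training)
label = classify(centroids, (7.5, 5.0))  # an unseen visitor
```

Nearest-centroid is only one of many possible classifiers; it is used here because it is short enough to read in full.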
Many SEO firms subscribe to the philosophy of “Why construct a team of engineers to track queries and check each algorithm when we can just write a program for it?” While the time saving is desirable, the technical process is far more complicated than that suggests.
At a basic level, a Machine Learning System ideally works as follows:
•The MLS will train itself by comparing input variables to outputs on a finite training set comprising a large amount of data gathered from a web site. This “training” data may be labeled or unlabeled (i.e. with no pre-assigned outputs).
•The MLS begins to compile data while simultaneously adjusting itself.
•Feedback received will then raise or lower the importance (the weighting) of certain parts of the system.
•Based on the hypothesis provided by its newly trained learning algorithm, the MLS will be able to make a future prediction or classify gathered data elements into certain types.
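The steps above can be sketched with a perceptron, one of the simplest learning algorithms in which feedback raises or lowers weights. This is an illustrative assumption rather than a description of any particular MLS; the page features, the labels and the names `train` and `predict` are all hypothetical.

```python
# Hypothetical labelled training set gathered from a site. Each page is
# (keyword_in_title, fast_load, many_ads) -> 1 if it reached page one, else 0.
training_set = [
    ((1, 1, 0), 1),
    ((1, 0, 0), 1),
    ((0, 1, 1), 0),
    ((0, 0, 1), 0),
]


def predict(weights, bias, features):
    """Hypothesis: weighted sum of the features, thresholded at zero."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def train(data, epochs=10, rate=0.5):
    """Compare predictions to known outputs and adjust weights on feedback."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, target in data:
            # Feedback is +1, 0 or -1 depending on how the guess was wrong.
            error = target - predict(weights, bias, features)
            # Raise or lower the importance of each feature accordingly.
            weights = [w + rate * error * x for w, x in zip(weights, features)]
            bias += rate * error
    return weights, bias


weights, bias = train(training_set)
fitted = [predict(weights, bias, features) for features, _ in training_set]
```

After training, `weights` shows which features the system has learned to treat as helpful (positive) or harmful (negative) on this toy data, and `predict` can be applied to pages it has not seen.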
Ghosts in the Machine
Much like any other machine, from the automobile to the iPhone, SEO machine learning systems are not perfect.
Mechanistic flaws with a web-based machine learning algorithm include:
• Unawareness of constantly changing algorithm factors.
• The addition of new variables over time such as social signals, visitor data, links, and bounce rate. However, this can be partially addressed by structuring the problem as “unsupervised”, i.e. with no pre-defined labels for the underlying factors.
• Search engine changes.
• Rapid formula changes.
• Multiple algorithms operating at different times in different parts of the world.
• Inability to monitor external influences (for example user fads, trends and global events that are unrelated to the behaviour of search engine algorithms). These simply aren’t encapsulated in data gathered from a web page.
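The “unsupervised” framing mentioned in the list above can be illustrated with a small clustering sketch: the algorithm is given numbers with no labels and finds the groups itself. This is a one-dimensional, two-group version of k-means; the bounce-rate figures and the name `kmeans_1d` are invented for illustration.

```python
def kmeans_1d(values, iterations=10):
    """Split unlabelled numbers into two groups around two moving centres."""
    centres = [min(values), max(values)]  # simple deterministic starting point
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to whichever centre is currently nearer.
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[nearest].append(v)
        # Move each centre to the mean of its group (keep it if group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres, groups


# Hypothetical bounce rates for six pages, with no labels attached.
bounce_rates = [0.21, 0.25, 0.30, 0.72, 0.78, 0.81]
centres, groups = kmeans_1d(bounce_rates)
```

Because no labels are defined in advance, the same approach can be re-run when a new variable appears, although interpreting what each group means is still left to a human.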
Simply put, machine learning is a good way to make a more rigorous educated guess about future site user behaviour, or to classify the users themselves into categories. However, it will never reverse-engineer the Google algorithm, nor will it provide a “be-all and end-all” how-to guide to higher search rankings.
Depending on your point of view, this news is either a blessing or a curse. If you consider great search content to be the name of the game, then you win out.
Applying Machine Learning to SEO
For sites with rotating content, machine learning can be used effectively to monitor how search engine algorithms behave.
Documenting which behaviours the algorithms favour over others is useful for enterprise sites such as newspapers or e-zines with discounts. Useful variables to monitor include:
•Aesthetic web design changes.
•Landing page content.
•Technology in use.
Machine learning for SEO is also effective during static search engine updates.
During these updates, your site (and those in the same query space) may produce non-spurious data that can be used to evaluate which factors have positive and which have negative impacts.
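One simple way to evaluate such factors is to correlate each one with the ranking change observed across sites in the same query space. The sketch below uses the Pearson correlation coefficient; the sites, factor names and figures are hypothetical, and a correlation on its own does not prove that a factor caused the change.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical snapshot of five sites in one query space around an update.
# rank_gain = positions gained after the update (negative = positions lost).
sites = [
    {"load_seconds": 1.0, "word_count": 1500, "rank_gain": 4},
    {"load_seconds": 2.0, "word_count": 1200, "rank_gain": 1},
    {"load_seconds": 3.0, "word_count": 900, "rank_gain": 0},
    {"load_seconds": 4.0, "word_count": 600, "rank_gain": -1},
    {"load_seconds": 5.0, "word_count": 300, "rank_gain": -4},
]

gains = [site["rank_gain"] for site in sites]
impact = {
    factor: pearson([site[factor] for site in sites], gains)
    for factor in ("load_seconds", "word_count")
}
```

A value near +1 suggests the factor moved with ranking gains in this sample, a value near -1 suggests it moved against them, and values near 0 suggest no clear linear relationship.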
Google and the Russian search engine Yandex are among the many currently using machine learning systems. Machine learning may not be perfect, and continuously changing algorithm factors add new variables to the equation, but it allows for a better idea of which content arrangement increases rankings.