The Trouble with Ranking Factor Lists

By Corey Northcutt, under SEO.

Alphabet, Inc. chairman Eric Schmidt told us Google ranks sites based on 200 factors.

But then he told us that those signals were super secret.

Bing, in kind, said that they rank by 1,000 factors.

And Google responded saying that they meant 10,000.

All the while, SEO professionals have been dreaming up their own lists of 200 ranking factors.  Most include added bonuses to hit that high number: things read on other blogs, heard while drunk at an SES conference, or that just seemed like a pretty good idea.

Not all of these lists are terrible, but in the nearly 15 years I've spent practicing SEO, I haven't seen a list that I love.

This is why I'd like to announce our own contribution to the cluttered landscape of Google Ranking Factors. But with a twist: because it's time for all of us to start checking our facts.

The Concept

When I set out to build our list, I wanted to address two problems.

  1. Credibility
  2. Usability

Let's talk about each.

1. Credibility

What is a credible source for how Google ranks sites?

The answer isn't as simple as it seems.  A lot of methods have been tried and most just aren't good.  Because of this, the web is full of terrible, unsolicited SEO advice masquerading as fact.

The Good:

Science:   The scientific method is our ideal tool.  Unfortunately, many factors are impossible to test in a vacuum, especially those related to overall domain authority and quality.  Other factors though, like the relevance of content to particular searches, make it much easier to set a hypothesis, run an experiment, and get clean results.

Patents:  A huge hat tip is due to Bill Slawski of SEO by the Sea, who has been analyzing dense search engine patents for the rest of us since forever.  A patent filing doesn't prove that a ranking method has been implemented, but it does imply there was intent.  And after sifting through a few patents, it becomes clear that most Google filings have been implemented, at least for a time, as verified using other methods.

Direct:  Occasionally, Google is completely forthcoming about how they rank sites.  Other times, it's necessary to read between the lines, like when we're told "doing X is a really good idea".  Many assume that after Google tells us something, it's law.  Unfortunately, that's not true either: our list unearths several contradictions that can usually be explained away by the age of the advice or differing interpretations of what another team within Google is doing.

The Bad:

Correlation:  Our own statistics-degree-wielding Cara Bowles summed up the challenges of correlation studies in a response to Moz's frequent correlation studies.  I love sifting through this data and hope that Moz does these forever, but these studies are usually misused.  Television ads are not a ranking factor, but the attention derived can create dozens of indirect ranking benefits or imply higher overall marketing spend.

Assumptions:  It's easy to make assumptions about Google.  Often, we're left no other option but to play the "if I were Google" card.  This can be innocent, like relating how Google looks at a single piece of content as something that also gets calculated site-wide. We have evidence that a number of factors do work that way, which supports those assumptions.  It can also spiral out of control, to the point that people seem to think that every single action on Facebook or Twitter carries special benefits or that words stuffed into invisible code comments impact rankings.

The Ugly:

Surveys:  I believe there are more talented SEOs working now than ever.  But I've also found that most of the unsolicited advice floating around the web is wrong and often outright dangerous.  How long has nearly all of the Internet gone on thinking meta keywords mattered for rankings?  Once again, I love reviewing survey data and hope that those creating it keep at it, but if you compare a poll of "experts" to the full scope of what we know from verifiable science, patents, and direct statements, these results are typically pretty alarming.

Telephone Game:  If we were to create a pie chart of SEO advice in the wild, this would have to be the largest slice.  It starts when Matt Cutts says something.  An amateur SEO blogger who's never practiced professionally intentionally misinterprets it for clickbait (see: every piece of SEO content that begins "X is dead" ever).  The community then embraces their pageview success and thousands of even worse pieces propagate from there.

Hustlers:  Three facts behind our ranking factor resource are that 1.) Google Search is complex, 2.) Most of what Google does is really not a mystery, and 3.) Any long-term reputation between your site and Google must be earned, not "hacked".  Far too many SEO hustlers ignore these facts and generate advice that's intentionally misleading.  If anyone ever tells you that they've "cracked Google's code" with uncited "Google secrets", run for the hills.

To solve this challenge while staying comprehensive, we acknowledge that all SEO knowledge fits somewhere on a sliding scale of bullshit.  All factors on our list are therefore given some succinct fact-checking, inspired by resources like Politifact.

Source citations are required for any factors rated better than "maybe".

For a factor to be rated as "concrete", we need at least two trustworthy sources (scientific verification, patent, or some direct acknowledgment from Google).  The scale runs all the way down to what we consider outright myths: ideas brought to our attention by widespread Internet chatter or disastrous previous work on our client sites, or ideas that a concrete source tells us are false, with no contradicting evidence to imply otherwise.
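As a rough illustration of the rating rules above, here is a minimal Python sketch. It is not the code behind our list: the Factor class, the rate function, and the intermediate "probable" label are all hypothetical names made up for this example, and only "concrete", "maybe", and "myth" come from the description above.

```python
from dataclasses import dataclass, field

# Source types treated as trustworthy in this article.
TRUSTWORTHY = {"science", "patent", "direct"}


@dataclass
class Factor:
    """Hypothetical record for one ranking factor (illustration only)."""
    name: str
    sources: list = field(default_factory=list)  # type of each cited source
    refuted: bool = False  # a concrete source says it's false, uncontradicted


def rate(factor: Factor) -> str:
    """Return a rating label for a factor under the assumed rules."""
    if factor.refuted:
        return "myth"
    trusted = [s for s in factor.sources if s in TRUSTWORTHY]
    if len(trusted) >= 2:
        return "concrete"   # needs at least two trustworthy sources
    if len(trusted) == 1:
        return "probable"   # assumed label; the real scale may differ
    return "maybe"          # nothing above "maybe" without a citation


short_content = Factor("Extremely short content", ["direct", "science"])
long_content = Factor("Longer content always helps", ["correlation"])
meta_keywords = Factor("Meta keywords", [], refuted=True)

for f in (short_content, long_content, meta_keywords):
    print(f.name, "->", rate(f))
```

Correlation studies, surveys, and hearsay simply don't count toward the total here, which is the whole point of the scale.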

2. Usability

Most other resources being enormous, sprawling blog posts, usability was also a major concern for this resource.  Filters have been added for all sorts of things.  First, the obvious:

Positive/Negative:  Some factors are helpful, while other factors are harmful.  The two are not simply mirror images of each other. For example, we have direct and scientific evidence that extremely short content can harm your rankings.  That's not the same thing as longer content always being helpful, a factor supported only by correlation.

On-page/Off-page:   On-page means factors under your direct control as you manage your website.  Off-page is the opposite.  Simple as that.

From there, we get granular:  on-page SEO factors are subcategorized in a pretty self-explanatory way: sitewide, content, code, and server.

Sitewide:  Factors where your domain as a whole may have a score.

Content:  Rating of a single page of content.

Code:   Client-side rendering topics.  HTML, JavaScript, CSS, etc.

Server:   Server-side rendering topics.  HTTP response codes, addressing, performance, etc.

Off-page SEO topics are a little stickier.  Here's how we define those subcategories:

Authority:  Overall popularity of a site through mostly inbound links.  Especially related to nuances of the PageRank distribution, as well as other possible subtle signals like those on social media.

Relevance:   Factors that relate links, content, and sites to particular search queries.  Especially related to off-page word usage nuances of the Hilltop algorithm.

Quality:   Factors that determine a site is of adequate quality.  Especially related to off-page scoring of resources and experts, such as may be done with the Panda algorithm.

Circumstantial:  A variety of scenarios cause Google to return different results.  These situations are factors to plan for on certain search queries, rather than attributes you can directly manipulate.

Patterns:  It's been made clear via patent and direct statement that Google's webspam team uses some patterns of unnatural activity to dish out devaluations.

Schemes:   Beyond unnatural patterns, Google may single out certain link schemes and penalize them in severe circumstances where reducing/removing their value may not be enough.

Intervention:  Describes a limited set of situations where a human may interfere with Google's automatically generated results.
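For anyone who thinks in code, the filters described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the Entry class and filter_entries function are hypothetical names, and the three sample entries are loose examples drawn from this post, not rows from the actual list.

```python
from dataclasses import dataclass

# Subcategories as defined above.
ON_PAGE = {"sitewide", "content", "code", "server"}
OFF_PAGE = {"authority", "relevance", "quality", "circumstantial",
            "patterns", "schemes", "intervention"}


@dataclass(frozen=True)
class Entry:
    """Hypothetical list entry (illustration only)."""
    name: str
    positive: bool   # True = helpful, False = harmful
    category: str    # one of the subcategories above

    @property
    def on_page(self) -> bool:
        return self.category in ON_PAGE


def filter_entries(entries, positive=None, on_page=None, category=None):
    """Keep entries matching every filter that is not None."""
    result = []
    for e in entries:
        if positive is not None and e.positive != positive:
            continue
        if on_page is not None and e.on_page != on_page:
            continue
        if category is not None and e.category != category:
            continue
        result.append(e)
    return result


# Sample entries, loosely based on examples mentioned in this post.
entries = [
    Entry("Extremely short content", positive=False, category="content"),
    Entry("Inbound links", positive=True, category="authority"),
    Entry("Link schemes", positive=False, category="schemes"),
]

negative_off_page = filter_entries(entries, positive=False, on_page=False)
print([e.name for e in negative_off_page])
```

Leaving a filter as None means "don't care", so the filters can be combined in any order.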

Between all of these filters, the goal was not to have the largest list of ranking factors (although including both facts and myths has done so).

I've intentionally left off a few wackier factors (like bookmarks being given some unique treatment, or every possible third-party backlink metric like MozRank and Majestic).

Looking at Matt Cutts's quote from earlier about "up to 50 factors of factors", it becomes pretty easy to understand some stumbling points.  Reaching 500+ factors in this way would not have been difficult.   Algorithms are complex and nearly every line of code inside of a function could be treated as a signal.  It's obviously unreasonable to itemize them all without such a resource growing boring, repetitive, and obvious.

On the flip-side, a lot of lists aren't granular enough.  Most mention "PageRank" as a factor.  But then, they list factors that are actually functions of the PageRank algorithm, such as the authority of sites that link to them, the number of inbound links, and the dilution of outbound link authority.  In cases where we look at one of Google's full, named algorithms, it's impossible to sum up functionality briefly.

Final Thoughts

I hope that our ranking factors resource helps a lot of people.

I hope it helps dispel some myths and prevents a lot of damage happening to good businesses.

To close with one last thought, as this resource is sure to be seen by a lot of other smart people, I'd like to put out a call for contributors.  A happy side-effect of this approach is that it made clear how many factors are still in a state of "citation needed".

Simple scientific studies are still possible on a huge number of topics here that are lacking.  I intend to spend a lot of our time in the future developing more worthy sources for this document that aim to prove concepts true or false.

I'll continue to welcome any further sources or new sources that could fit this list.