There's a lot of information floating around these days about what makes a site rank. Most of it's misinformed, outdated, or entirely untrue, because practically nobody cites their sources. Worst of all, the products/services that stem from that kind of recklessness are often dangerous.

Despite all that, we actually know a lot for certain about the way Google ranks sites. Real SEO knowledge doesn't come from a random blogger, forum, or get-rich-quick scheme. The best information comes from three sources:

  1. Patent filings
  2. Direct statements from Google and/or their team
  3. Applying The Scientific Method

This resource is a complete guide to how Google ranks sites. We've included factors that are controversial or even outright myths, but created filters to hide the junk. The information below is updated constantly, so if you're serious about SEO, we recommend signing up for quarterly updates (below) so that you don't lose touch.







Positive On-Page Factors

On-page SEO describes factors that you are able to manipulate directly through the management of your own website. Positive factors are those which help you to rank better. Many of these factors may also be abused, to the point that they become negative factors. We will cover negative ranking factors later in this resource.

In broad terms, positive on-page ranking factors relate to establishing the subject matter of content, accessibility across various environments, and a positive user experience.


Keyword in URL

Keywords and phrases that appear in the page URL, outside of the domain name, aid in establishing relevance of a piece of content for a particular search query. Diminishing returns are apparently achieved as URLs become lengthier or as keywords are used more than once.

Source(s): Patent US 8489560 B1, Matt Cutts

Keywords Earlier in URL

The order in which keywords appear in a URL matters. It's been theorized that keywords appearing earlier in a URL have more weight. At minimum, it's been confirmed by Matt Cutts that "after about five" words, the weight of a keyword dwindles.

Source(s): Matt Cutts

Keyword in Title Tag

Title tags define the title of a document or page on your site, and often appear both in the SERP and as snippets for social sharing. Titles should be no longer than 60-70 characters, depending on the characters (Moz Tool). As with URLs, keywords closer to the beginning are widely theorized to have more weight.

Source(s): US 20070022110 A1
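
The two checkable parts of this advice, title length and how early the keyword appears, can be sketched in a few lines of Python. This is only an illustration: the function name `check_title`, the 60-character default, and the "offset" measure are our own assumptions, not anything Google has published.

```python
def check_title(title: str, keyword: str, max_chars: int = 60) -> dict:
    """Report simple, checkable properties of a title tag.

    max_chars is an assumed budget taken from the 60-70 character guideline.
    """
    position = title.lower().find(keyword.lower())  # -1 when the keyword is absent
    if position == -1:
        offset = None
    else:
        # Fraction of the title that precedes the keyword (0.0 = very start).
        offset = position / max(len(title), 1)
    return {
        "length": len(title),
        "within_limit": len(title) <= max_chars,
        "keyword_present": position != -1,
        "keyword_offset": offset,
    }


if __name__ == "__main__":
    print(check_title("Google Ranking Factors: A Complete Guide", "ranking factors"))
```

A lower `keyword_offset` simply means the keyword sits closer to the start of the title; whether that helps a given page is still something to test.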

Keyword Density of Page

The percentage of times a keyword appears in text. Practicing SEOs once sculpted all content so that a single keyword/phrase appeared 5.5%-6% of the time. In the early-to-mid-2000s, this was very effective. Google has since improved so much with other types of content analysis that those tactics are scarcely relevant in 2015. Keyword Density, although referenced in Google patents, is almost certainly just a simplified concept within TF-IDF, which we'll cover next.

Source(s): Patent US 20040083127 A1
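
For reference, here is a minimal sketch of how a keyword density percentage is typically computed. Conventions vary between tools; this version assumes density means "words belonging to occurrences of the phrase, divided by total words", and the function name `keyword_density` is ours.

```python
import re


def keyword_density(text: str, phrase: str) -> float:
    """Percentage of the words in `text` that belong to occurrences of `phrase`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase appears as a run of whole words.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return 100.0 * hits * n / len(words)


if __name__ == "__main__":
    sample = "seo tips for seo beginners who like seo"
    print(keyword_density(sample, "seo"))  # 3 of 8 words
```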

TF-IDF of Page

Think of TF-IDF, or Term Frequency-Inverse Document Frequency, like Keyword Density with context. TF-IDF weighs the density of keywords on a page against what is "normal" rather than just seeking out a flat, raw percentage. This serves to ignore words like "the" in computation and establishes how many times a literate human should probably mention a phrase like "Google Ranking Factors" in a single document that covers such a topic.

Source(s): Dan Gillick and Dave Orr, Patent US 7996379 B1
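
The idea can be made concrete with the textbook TF-IDF formula. The sketch below is the plain, unsmoothed variant (term frequency times the log of total documents over documents containing the term); it is not Google's implementation, and the tiny three-document corpus is made up for illustration.

```python
import math
import re


def tokenize(text):
    """Lowercase a string and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf(term, doc_index, corpus):
    """Textbook TF-IDF of `term` in corpus[doc_index], relative to the whole corpus."""
    term = term.lower()
    tokenized = [tokenize(doc) for doc in corpus]
    tokens = tokenized[doc_index]
    count = tokens.count(term)
    if count == 0:
        return 0.0
    tf = count / len(tokens)                       # how dense the term is on this page
    df = sum(1 for doc in tokenized if term in doc)  # how many pages mention it at all
    idf = math.log(len(corpus) / df)               # 0 when every page mentions it
    return tf * idf


if __name__ == "__main__":
    corpus = [
        "the google ranking factors guide",
        "the history of the bicycle",
        "the best steak recipes",
    ]
    print(tf_idf("the", 0, corpus))      # appears everywhere, so it carries no weight
    print(tf_idf("ranking", 0, corpus))  # rare across the corpus, so it scores higher
```

Because "the" appears in every document, its inverse document frequency is zero and it drops out of the score, which is the "what is normal" adjustment described above.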

Key Phrase in Heading Tag (H1, H2, etc.)

Keywords in Heading tags have strong weight in determining the relevant subject matter of a page. An H1 tag carries the most weight, H2 has less, and so forth. These tags also improve accessibility for screen readers, and clear, descriptive headings reduce bounce rates according to various studies.

Source(s): In The Plex, Penn State

Words with Noticeable Formatting

Keywords in bold, italic, underline, or larger fonts have more weight in determining the relevant subject matter of a page, but less weight than words appearing in a heading. This is confirmed by Matt Cutts, SEOs, and a patent that states: "matches in text that is of larger font or bolded or italicized may be weighted more than matches in normal text."

Source(s): Matt Cutts, Patent US 8818982 B1

Keywords in Close Proximity

The closeness of words to one another implies association. To anyone that's ever wielded the English language, this won't come as a surprise. One paragraph about your SEO work in Chicago will thus do more to rank for "Chicago SEO" than two paragraphs, with one about SEO and one about Chicago.

Source(s): Patents: US 20020143758 A1, US 20080313202 A1
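
One simple way to reason about proximity is to measure the smallest gap, in words, between two terms. The sketch below does only that; the function name `min_distance` and the idea of using a raw word gap are our own simplification, not a scoring method taken from the cited patents.

```python
import re


def min_distance(text, term_a, term_b):
    """Smallest number of word positions between term_a and term_b, or None."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    a, b = term_a.lower(), term_b.lower()
    last_a = last_b = best = None
    for i, word in enumerate(words):
        if word == a:
            last_a = i
        if word == b:
            last_b = i
        # Whenever either term was just seen, compare against the other's last position.
        if word in (a, b) and last_a is not None and last_b is not None:
            gap = abs(last_a - last_b)
            best = gap if best is None else min(best, gap)
    return best


if __name__ == "__main__":
    print(min_distance("We do SEO in Chicago", "seo", "chicago"))
    print(min_distance(
        "SEO is our craft. We have offices in many cities including Chicago",
        "seo", "chicago",
    ))
```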

Keyword in ALT Text

The ALT attribute of an image is used to describe that image to search engines and to users who are unable to display the image. This establishes relevance, especially for Image Search, while also improving accessibility.

Source(s): Matt Cutts

Exact Search Phrase Match

Although Google may return search results that contain only part of a search phrase as it appears on your page (or in some cases, none at all), a patent states that a higher Information Retrieval (IR) score is given for an exact match. Specifically, it states that "a document matching all of the terms of the search query may receive a higher score than a document matching one of the terms."

Source(s): Patent US8818982 B1

Partial Search Phrase Match

It's established by a Google patent that when a page contains an exact match of a search phrase, its perceived relevance to that query, dubbed the Information Retrieval (IR) score, is significantly higher. In the process, the patent confirms that you may still rank for certain search queries when a page contains a search phrase not exactly as it was entered into Google. This is further verified by just doing a lot of Googling.

Source(s): Patent US8818982 B1

Keywords Higher on Page

There's a natural trend in how we write English: earlier is usually more important. This applies to sentences, paragraphs, pages, and HTML tags. Google seems to apply this everywhere as well, with content that appears earlier and more visibly being given more weight. This is, at the very least, a function of the Page Layout algorithm, which gives a lot of preference to what appears above-the-fold on your site.

Source(s): Matt Cutts

Keyword Stemming

Keyword stemming is the practice of taking the root or 'stem' of a word and finding other words that share that stem (e.g. 'stem-ming', 'stem-med', etc.). Avoiding these word variations, such as for the sake of a keyword density score, results in poor readability and has a negative impact. Stemming was introduced in 2003 with the Florida update.

Source(s): Matt Cutts
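
To show what "sharing a stem" means in practice, here is a deliberately crude suffix-stripping sketch. It is not the Porter stemmer or anything Google has described; it only handles a few English endings, and the function name `naive_stem` is hypothetical.

```python
def naive_stem(word):
    """Strip a few common English suffixes; a toy stand-in for a real stemmer."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and len(stem) >= 3:
            # Undo consonant doubling, e.g. "stemm" -> "stem".
            if stem[-1] == stem[-2] and stem[-1] not in "aeiou":
                stem = stem[:-1]
            return stem
    return word


if __name__ == "__main__":
    for w in ("stem", "stems", "stemming", "stemmed"):
        print(w, "->", naive_stem(w))
```

All four variations reduce to the same stem, which is why writing naturally with different word forms does not dilute the topic of a page.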

Keyword is Domain Name

Also referred to as an Exact Match Domain or EMD. A powerful ranking bonus is attributed when a keyword exactly matches a domain and a search query meets Google's definition of a "commercial query". This was designed so that brands would rank for their own names, but was frequently exploited and, as a result, made less powerful in various circumstances.

Source(s): Patent EP 1661018 A2, US 8046350 B1

Keyword in Domain Name

A ranking bonus is attributed when a keyword or phrase exists within a domain name. The weight given seems to be less significant than when the domain name exactly matches a particular search query, but more significant than when a keyword appears later in the URL.

Source(s): Patent EP 1661018 A2

Keyword Density across Domain

Krishna Bharat identified a problem with PageRank when he introduced Hilltop: "a web-site that is authoritative in general may contain a page that matches a certain query but is not an authority on the topic of the query". Hilltop improved search by looking at the relevance of entire sites, labeled "experts". Since TF-IDF determines page-level relevance, we make a small assumption that Hilltop defines an "expert" domain using the same tools.

Source(s): Krishna Bharat, Patent US 7996379 B1

TF-IDF across Domain

Saying "Keyword Density" instead of "Term Frequency" in 2015 throws a lot of SEO specialists into a rage, despite the two being perfect synonyms. What's important when talking about "Keyword Density" factors is again the latter half of TF-IDF: Inverse Document Frequency. Google throws out words like adverbs with TF-IDF and dynamically evaluates the natural density for a topic. Comparative metrics on "how much is natural" have apparently decreased over time.

Source(s): Dan Gillick and Dave Orr, Patent US 7996379 B1

Distribution of Page Authority

Typically, pages that are linked sitewide are given a large boost, pages linked from them get a lesser boost, and so forth. A similar effect is often seen from pages linked from the homepage, because this is commonly the most-linked page on most websites. Creating a site architecture to maximize this factor is commonly known as PageRank Sculpting.

Source(s): Patent US 6285999 B1
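
The way authority flows through internal links can be illustrated with the simplified, textbook form of PageRank. The sketch below uses the commonly quoted damping factor of 0.85 and a made-up four-page site; it is a teaching version, not Google's production algorithm.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical site: every page links back to "home", as with sitewide navigation.
    site = {
        "home": ["about", "blog"],
        "about": ["home"],
        "blog": ["home", "post"],
        "post": ["home"],
    }
    for page, score in sorted(pagerank(site).items(), key=lambda kv: -kv[1]):
        print(f"{page}: {score:.3f}")
```

In this toy site the page linked from everywhere ends up with the highest score, pages linked directly from it come next, and the page two clicks away trails, mirroring the pattern described above.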

Old Domain

This is somewhat confusing, since a brand new domain name may also receive a temporary boost. Older domains are given a little more trust, which Matt Cutts emphasizes is pretty minor (while, in the process, acknowledging that it exists). Speculatively, this may be rewarding sites that have had a chance to prove themselves not to be a part of short-term black hat projects.

Source(s): Matt Cutts

New Domain

New domains may receive a temporary boost in rankings. In a patent discussing methods of determining fresh content, it's stated "the date that a domain with which a document is registered may be used as an indication of the inception date of the document." That said, the impact this actually has on one's rankings is, according to Matt Cutts, relatively small. Speculatively, this may be intended to give a brand new site, or timely niche site, just enough chance to get off the ground.

Source(s): Patent US 7346839 B2, Matt Cutts

Hyphen-Separated URL Words

The ideal method of separating keywords in a URL is to use a hyphen. Underscores can work, but are not as reliable, as they can be confused with programming variables. Mashing words together in a URL is likely to cause words to not be seen as separate keywords, thus preventing any Keyword in URL bonus. Aside from these scenarios, just using a hyphen will not make a site rank higher.

Source(s): Matt Cutts
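
A generic tokenizer makes the difference easy to see. In most regular expression engines, `\w` treats the underscore as a word character, so a naive `\w+` split keeps underscore-joined words together while hyphens break them apart. This is an assumption about tokenizers in general, shown for illustration; it is not a description of Google's parser, and `url_words` is a hypothetical helper.

```python
import re
from urllib.parse import urlparse


def url_words(url):
    """Split a URL path into tokens using a plain word-character regex."""
    path = urlparse(url).path
    return re.findall(r"\w+", path.lower())


if __name__ == "__main__":
    print(url_words("https://example.com/chicago-seo-services"))
    print(url_words("https://example.com/chicago_seo_services"))
    print(url_words("https://example.com/chicagoseoservices"))
```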

Keywords Earlier in Tag

An SEO theory manifested itself in the early 2000s called the first third rule. It noted that our language - sentences, titles, paragraphs, even entire web pages - is generally used in order of importance. Although not confirmed by Google, Northcutt's experience with word order experiments has more frequently indicated that this is a factor.

Source(s): Speculative

Long Domain Registration Term

Google directly states in this patent that longer domain registration terms predict the legitimacy of a domain. Speculatively, those that engage in webspam understand that it's a short-term, high volume game of burn/rinse/repeat and don't purchase domains for longer than they need.

Source(s): Patent US 7346839 B2

Public Whois

Despite Google downplaying their ability to investigate Domain Registrant information, we know of a patent that discusses using Domain Registration Terms to single out webspam schemes. We've also seen Matt Cutts speak about private whois contributing to penalties, and encouraging visitors on his blog to report fake whois data. We believe that this is a wise "play it safe" card, despite only a lack thereof being confirmed as a (negative) factor.

Source(s): Patent US 7346839 B2, Matt Cutts

Use of HTTPS (SSL)

SSL was officially announced as a new positive ranking factor in 2014, regardless of whether the site processed user input. Gary Illyes downplayed the significance of SSL in 2015, calling it a tiebreaker. Although, for an algorithm based on the numeric scoring of billions of web pages, we've found that tiebreakers very often make all of the difference on competitive search queries.

Source(s): Google, Gary Illyes

Schema.org

With the advent of Schema.org, a joint project between Google, Yahoo!, Bing, and Yandex to understand logical data entities over keywords, we move further away from the traditional "10 blue links" style of search. Currently, use of Structured Data can improve rankings in a massive variety of scenarios. There are also theories that schema.org can improve traditional search rankings by catering to a ranking method known as entity salience.

Source(s): Schema.org, Matt Cutts

Fresh Content

The full name of this one is technically "fresh content when query deserves freshness". This term, Query Deserves Freshness (often shortened to QDF), refers to search queries that would benefit from more current content. This does not apply to every query, but it applies to quite a lot, especially those that are informational in nature. These SEO benefits are just one more reason that brand publishers tend to be very successful.

Source(s): Matt Cutts

Domain-wide Fresh Content

There is unconfirmed speculation that domain-wide performance is improved by maintaining fresh content. Speculatively, this means that overall the resource that Google is recommending is less "stale" and more accurate/relevant, especially if at least some significant portion of the information has been worth a little upkeep or supplementation by the owner.

Source(s): Patent US 8549014 B2, Speculation

Old Content

A Google patent states: "For some queries, older documents may be more favorable than newer ones." It goes on to describe a scenario where a search result set may be re-ranked by the average age of documents in the retrieved results before being displayed.

Source(s): Patent US 8549014 B2

Domain-wide Old Content

Theoretically, for all we have heard about Query Deserves Freshness (QDF), which serves news-like content in a number of circumstances, there may be some sort of "Query Deserves Oldness". Considering that we've never been told about "QDO" by Google, it may be reasonable to conclude that older content is always preferred when QDF is not at play. Just like domain-wide freshness, however, we don't have too much evidence to confirm a domain-wide seniority score.

Source(s): Speculation

Good Spelling and Grammar

This is a Bing ranking factor. Amit Singhal stated "these are the kinds of questions we ask" regarding spelling/grammar in Google's definition of quality content. Matt Cutts said no in 2011 as of "a long time ago", but also that rankings correlate anyway. Our agency's findings have been that the first Panda update made this seem to matter a lot. Regardless, many content-related factors here are clearly affected by spelling/grammar.

Source(s): Amit Singhal

Reading Level

We know that Google analyzes the reading level of content, since they created such a search filter for the results page (now removed). We also know that content mills, which Google is not fond of, are considered to be very basic, whereas academic writing is considered very advanced. What we don't have, as of yet, is a concrete source or study that directly relates reading level to rankings.

Source(s): Correlation Study, Speculation

Rich Media

Rich media, on top of drawing more traffic from in-line image and video search, has long been considered a component of "high quality, unique content". Video appeared to be the deciding factor with Panda 2.5. Northcutt's work has also shown a positive correlation. Currently though, there's no official, public source signing off on this factor.

Source(s): SEL on Panda 2.5

Subdirectories

Categorical Information Architecture has been an SEO discussion point for a long time, as it seems that Google analyzes topic coverage across entire sites. The exact ranking implications of this are unclear, but Google now refers to this as Structured Data, and at the very least, will use it to display breadcrumbs on the results page, therefore ranking more pages.

Source(s): Google Developers

Meta Keywords

Some SEOs claim that the meta keywords tag never mattered for SEO. That's a myth. The notion that Google ranks meta keywords in 2015 is also a myth. Both of these facts were confirmed the same way - by placing a zero-competition, made-up word in a meta keywords tag, getting that page into the index, then searching that word. Remember though, that Google is not the only search engine, and could theoretically index countless other dynamic sites that benefit from this tag.

Source(s): Matt Cutts, Experiment Page

Mobile Friendliness

Mobile-friendly websites are given a significant ranking advantage. For now, the ranking implications of this appear to pertain only to users searching on mobile devices. This made its way into the mainstream SEO conversation and became more severe during the Mobilegeddon update in 2015, although experts had been speculating on this topic for nearly a decade before that.

Source(s): Various Studies

Meta Description

A good meta description functions as a search ad. Considering how many AdWords agencies exist almost entirely on A/B testing AdWords ads, the marketing value here can't be overstated. Although keywords used in meta descriptions were once widely considered a direct ranking factor, Matt Cutts stated in 2009 that they're not now.

Source(s): Matt Cutts

Google Analytics

Many have suggested that Google Analytics is or may become a Google ranking factor. All evidence at present, as well as very clear statements from Matt Cutts, indicate that any ranking benefits coming from Google Analytics, now or ever in the future, are an absolute myth. That said, it's an amazingly powerful tool in the right marketer's hands.

Source(s): Matt Cutts

Google Search Console

Just like Google Analytics, there are no confirmed ranking benefits to using Google Search Console (formerly known as Webmaster Tools) in any way. Search Console is still useful in unearthing problems related to other ranking factors on this page, especially those related to manual penalties and certain crawler errors.

Source(s): Speculation

ccTLD in National Ranking

Country code TLDs such as .uk and .br are believed to carry with them a ranking bonus to searches from the same country, which is especially useful for internationalization. They should also perform far better in contrast to a ccTLD from another country.

Source(s): Speculation

XML Sitemaps

Sitemaps can be useful, though not required, for the purpose of getting more pages of your site into the Google index. The notion that an XML sitemap will improve rankings within Google is a myth. This comes straight from Google and is confirmed by various studies.

Source(s): Susan Moskwa & Trevor Foucher

Salience of Entities

As time goes on, Google seems to do more to analyze ideas and logical entities in preference to words and phrases. It analyzes how we say things in preference to exact search queries that appear on a page. This process, in simple terms, is what's making it possible to search for "how to cook meat", and be returned results for steak recipes that might not mention the word "meat" directly anywhere.

Source(s): Dan Gillick & Dave Orr

Phrasing and Context

As keyword density is now virtually a non-factor, a basic understanding of Phrase-Based Indexing tells us that if you write about content thoroughly and elaborately, you stand a far better chance of ranking compared to writing generic content that just happens to drop a lot of keywords. A clear component of one Google patent describes this as the "identification of related phrases and clusters of related phrases".

Source(s): Patent US 7536408 B2

Web Server Near Users

Google functions differently on many local queries, supplementing traditional results with Google Maps results, and potentially altered organic listings as well. The same is true for national and international searches. By hosting your site at least loosely near to your users, such as within the same country, you are likely to enjoy better rankings.

Source(s): Matt Cutts

Author Reputation

Authorship was an experiment that Google ran from 2011 to 2014, which thrived upon bloggers using the rel="author" tag to establish the reputation of particular authors. Google directly confirmed both the creation and the demise of authorship. Eric Enge did a nice eulogy on the rise and fall of authorship on Search Engine Land.

Source(s): John Mueller

Using rel="canonical"

The rel="canonical" tag suggests the ideal URL for a page. This can avert duplicate content devaluations and penalties when multiple URLs might result in the same content. Our experience is that this is only a suggestion to Google and one that is often ignored. According to Google it does not directly improve rankings. Despite all of this, it's a very good idea.

Source(s): Google
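
If you want to audit which canonical URL a page actually declares, a short script using Python's standard `html.parser` module can pull it out of the markup. This is a minimal sketch; the class and function names are our own, and the sample page is made up.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, so check each token.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    page = (
        "<html><head>"
        '<link rel="canonical" href="https://example.com/widgets">'
        "</head><body>Widgets</body></html>"
    )
    print(find_canonical(page))
```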

Using rel="author"

Using rel="author" was once widespread SEO advice and hypothesized as a positive ranking factor, but Google's use of this factor at all went away along with an entire practice known as Authorship. The notion that rel="author" is beneficial for any reason whatsoever is now regarded as a myth.

Source(s): John Mueller

Using rel="publisher"

Just like rel="author", using rel="publisher" was once widespread SEO advice and also hypothesized as a positive ranking factor. And, just like rel="author", Google's use of rel="publisher" at all went away along with an entire practice known as Authorship.

Source(s): John Mueller

URL uses "www" Subdomain

A common misconception propagated by SEO bloggers suggests that a site may rank better if your URLs start with "www". This originates from the idea that we often force all pages on a site to resolve at "www". The reason that we actually do this is simply to avoid serving the same content at two different URLs, which would bring about a negative ranking factor.

Source(s): Speculation

Dedicated IP Address

Web server IP addresses can be useful for geo-targeting certain demographics. They can be negative ranking factors when they sit amidst a significant private webspam operation, or are used by the Hilltop algorithm to identify two sites as being from differing owners. But, the notion that just having a dedicated IP address provides a direct ranking advantage has been repeatedly debunked.

Source(s): Matt Cutts

Subdomain Usage

Subdomains (thing.yoursite.com) are often viewed as separate websites by Google, as compared to subfolders (yoursite.com/thing/), which are not. This has obvious implications with many other factors on this page. Matt Cutts called subfolders/subdomains "roughly equivalent" in 2012, confirming this now happens less often, but still happens. Panda recovery stories post-2012, such as HubPages' migration from subfolders to subdomains, prove that it can still be a major factor.

Source(s): Matt Cutts

Number of Subdomains

The number of subdomains on a site appears to be the most significant factor in determining whether subdomains are each treated as their own sites (as occurs in nature with free web hosting services and hybrid hosting/social sites like HubPages), or just portions of a common site. Presumably, thousands of subdomains means that they don't all belong to a single thematic site and are likely each websites in their own right.

Source(s): Speculation

Use AdSense

Although SEO paranoia seems to make this frequent advice, it's directly denied by Google. We've also found no real evidence to support it, and have seen no noticeable effects when assisting with optimizations for media monetization, which is something that our agency frequently does. We're therefore prepared to firmly declare this factor a myth.

Source(s): Matt Cutts

Keywords in HTML Comments

This is an early SEO theory that's very easily debunked by a ten second experiment and a little patience. In the cited example, we place an extremely non-competitive made-up word in our source code, then link to it prominently so that it gets indexed. If that word appears in search, we have evidence that Google ranks by that word. In this case, it doesn't.

Source(s): Experiment Page

Keywords in CSS/JavaScript Comments

Another twist on an early SEO theory that's very easily debunked by a ten second experiment and a little patience. In the cited example, we place an extremely non-competitive made-up word in our source code, then link to it prominently so that it gets indexed. If that word appears in search, we have evidence that Google ranks by that word. In this case, it doesn't.

Source(s): Experiment Page

Keywords in CLASSes, NAMEs, and IDs

Once again, we can debunk theories as to whether or not words in an odd place have any impact on search engines by putting a non-competitive phrase there and waiting. It's not worth even speculating at what Google tells us or what's in a patent. And again here, we can confirm that this factor is a myth, at least at the time of writing this.

Source(s): Experiment Page

Privacy Policy Usage

A single experience was posted on Webmaster World in 2012 which sprawled into a larger discussion: does having a Privacy Policy benefit rankings? For what it's worth, 30% of Search Engine Roundtable-ers voted yes, and it does fit Google's stated philosophies pretty well. Still, this is very theoretical.

Source(s): SER Discussion

Verifiable Address

A physical address is theorized as a mark of legitimacy in standard search rankings. This is loosely supported by the notion that Google looks at citations for local SEO (also known as Google Maps SEO) as mentions of Name, Address, Phone (sometimes shortened to "NAP") together. "Highly satisfying contact information" is also something that Google quality control auditors are instructed to seek out.

Source(s): Search Engine Land

Verifiable Phone Number

A phone number is theorized as a mark of legitimacy in standard search rankings. This is loosely supported by the notion that Google looks at citations for local SEO (also known as Google Maps SEO) as mentions of Name, Address, Phone (sometimes shortened to "NAP") together. "Highly satisfying contact information" is also something that Google quality control auditors are instructed to seek out.

Source(s): Search Engine Land

Accessible Contact Page

Theorized as a mark of legitimacy. It appears that this may have originated, or is at least best-supported, from a document called Google's Quality Rater Guidelines. In this document, Google asks quality control auditors to search for "highly satisfying contact information."

Source(s): Search Engine Land

Low Code-to-Content Ratio

This SEO theory seemed to become widespread in 2011, suggesting that more content and less code is good. Here's what we know: 1.) Speed is a confirmed factor, 2.) Google's own PageSpeed Insights tool really presses even a 5Kb reduction in payload size, 3.) Certain, subtle code mistakes can cause devaluations and penalties. So at minimum (and more likely) this is an issue of indirect correlation.

Source(s): SitePoint Post, SEOChat Tool
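
A ratio like this is easy to measure yourself. The sketch below, built on Python's standard `html.parser`, divides the length of the visible text by the length of the full HTML source. The exact definition (characters rather than bytes, and skipping script and style contents) is our own assumption; the tools cited above may calculate it differently.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def text_to_html_ratio(html):
    """Visible text length divided by total source length (0.0 for empty input)."""
    if not html:
        return 0.0
    extractor = TextExtractor()
    extractor.feed(html)
    # Collapse runs of whitespace so indentation does not count as content.
    text = " ".join(" ".join(extractor.parts).split())
    return len(text) / len(html)


if __name__ == "__main__":
    sample = "<html><body><p>Hello world</p></body></html>"
    print(round(text_to_html_ratio(sample), 2))
```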

Meta Source Tag

The Meta Source Tag was created for Google News in 2010 to better-attribute sources. It comes in two forms: syndication-source (if syndicating a 3rd party) and original-source (you're the source). In situations where content is syndicated, this may theoretically help avoid duplicate content penalties. If you're the original-source, this tag is overridden by rel="canonical" anyway.

Source(s): Eric Weigle

More Content Per Page

SerpIQ conducted an interesting correlation study comparing the length of content to top rankings, which decidedly favors content with 2,000-2,500 words. It's not clear if this is an indirect function of other factors, such as these pages being better-liked and therefore drawing more links/shares, or growing popular by ranking for more, longer search query variations.

Source(s): SerpIQ

Meta Geo Tag

Unlike IP address and ccTLDs, Matt Cutts states that they "barely look at this tag, if at all", although he did suggest that this tag might be considered if you were to use it on a gTLD site (such as ".com"), and attempt to restrict it to a country. So, while this is confirmed to be almost useless, it was suggested that Google does at least look at it and may consider it a factor very, very rarely for internationalization.

Source(s): Matt Cutts

Keywords Earlier in Display Title

More than a decade of studies and correlation research suggests that titles that begin with a keyword usually (but not always) rank better than titles ending in a keyword. It's easy to test and usually confirms: earlier keywords are better. But our chosen source for this suggests more. Thumbtack.com conducted a study where title word order changed traffic by 20%-30%. Their best-performing titles didn't begin with a keyword, but were altered (as Google sometimes does) to do so in Google's results.

Source(s): Thumbtack Study

Keywords Earlier in Headings

Heading tags are another place where word order appears to really matter. Again, something known as the "first third rule" has been often thrown around on this topic - suggesting that words appearing earlier have more weight. Usually our findings have confirmed this, but regardless, it's well-worth testing, especially in the H1 position.

Source(s): Speculation

Novel Content against Web

A patent suggests that Google devalues a lot more than just identical duplicate content. Google has literally discussed methods for calling your content uninteresting. Once determining that a set of articles are related, such a process identifies which are more descriptive, unique, and/or weird (in a good way). It then rewards these "information nuggets" as part of a "novelty score".

Source(s): Patent US 8140449 B1

Novel Content against Self

Google patents suggest that the genuine uniqueness/weirdness of content, as well as how elaborately that content speaks, determines something known as a "novelty score". This is done by quantifying/qualifying "information nuggets" within text. We pretty much know only that Google's methods for novelty scoring require comparing many individual documents. Considering that duplicate content is weighed both internally and externally, however, novelty scores likely are as well.

Source(s): Patent US 8140449 B1

Sitewide Average Novelty Score

Kumar and Bharat's patent titled "Detecting novel document content" describes how single documents may be scored on how "novel" (that's an adjective) they are. Assigning an average novelty score sitewide also appears to fit the narrative of other known sitewide factors such as sitewide thin content (Panda algorithm behavior) and sitewide expert relevance (Hilltop algorithm behavior).

Source(s): Patents US 8140449 B1, US 8825645 B1, Speculation

Quantity of Comments

We know from countless sources and even certain Search Console messages that Google can separate user-generated content and analyzes it differently. One theory suggests that Google might look at quantities of comments on content to help rate content quality. At present, however, there is no clear evidence for this factor beyond maybe fitting an "if I were Google" narrative. Speculatively, it would also be one of the easiest factors to game.

Source(s): Speculation

Positive Sentiment in Comments

It's theorized that Google looks at blog comment opinions to determine the quality of content. There is a patent and confirmation from Google that they score the sentiment expressed towards an entire site in product reviews. But according to Amit Singhal, they're not able to apply this to content, because "if we demoted web pages that have negative comments against them, you might not be able to find information about many elected officials".

Source(s): Amit Singhal, Patent US 7987188 B2

Using rel="hreflang"

There's no definitive evidence (that we're aware of) behind the HTML tag, <link rel="alternate" hreflang="...">, alone, helping you rank better. But, there is apparently a benefit for defining clearer signals behind different regional/language variations of a site. Often, multiple such signals are beneficial.

Source(s): Google
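As a minimal sketch of what these annotations look like in practice, the Python below prints one <link rel="alternate" hreflang="..."> tag per language variation. The function name, language codes, and example.com URLs are placeholders for illustration, not anything taken from Google.

```python
from html import escape

def hreflang_links(variants: dict[str, str]) -> str:
    """Build one <link rel="alternate" hreflang="..."> tag per variation.

    `variants` maps a language/region code to the URL of that version.
    """
    return "\n".join(
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}">'
        for lang, url in variants.items()
    )

# Hypothetical site with a UK and a Canadian-French version.
tags = hreflang_links({
    "en-gb": "https://example.com/uk/",
    "fr-ca": "https://example.com/ca-fr/",
})
print(tags)
```

Each variation's page would carry the full set of tags in its <head>, including a tag that points at itself.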

Negative On-Page Factors

Negative Ranking Factors are things you can do that harm your existing rankings. These factors fit into three categories: accessibility, devaluations, and penalties. Accessibility issues are just stumbling points for Googlebot that could prevent your site being crawled or analyzed properly. A devaluation is an indicator of a lower quality website and may prevent yours from getting ahead. A penalty is far more serious, and may have a devastating effect on your long-term performance in Google. Once again, on-page factors are those that are under your direct control as a part of the direct management of your website.

Negative On-Page Factors

High Body Keyword Density

Keyword Stuffing penalties arise when abusing a once extremely effective tactic: sculpting Keyword Density to a high level. Our own experiments have shown that penalties can happen as early as 6% density, though TF-IDF (covered earlier) is likely at play and this is sensitive to topics, word types, and context.

Source(s): Matt Cutts, Remix
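To make the idea of Keyword Density concrete, here is a rough Python sketch. It assumes the simplest possible definition (occurrences of a single word divided by total words); Google's actual measurement is not public, and the function name and sample text are ours.

```python
import re

def keyword_density(text: str, keyword: str) -> float:
    """Share of words in `text` that equal `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

sample = "Cheap shoes here. Buy cheap shoes online, because cheap shoes are cheap."
density = keyword_density(sample, "cheap")
print(f"{density:.1%}")  # 4 of 12 words
```

A page-level check would run this over the visible body text and compare the result against whatever threshold your own testing supports, such as the 6% figure mentioned above.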

Keyword Dilution

This factor manifests itself from logic: if a higher Keyword Density or TF-IDF is positive, at some point, a total lack of frequency/density will decrease relevance. As Google has improved at understanding natural language, this may be better described as Subject Matter Dilution: writing content that wanders without any clear theme. The same basic concept is at play either way.

Source(s): Matt Cutts

Keyword-Dense Title Tag

Aside from a page as a whole, Keyword Stuffing penalties appear to be possible within the title tag. An ideal title tag should definitely be less than 60-70 characters and hopefully still provide enough value to function as a good search ad in Google's results. At absolute minimum, there is no benefit in using the same keyword five times in the same tag.

Source(s): Matt Cutts
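A quick way to audit title tags against the guidance above is sketched below in Python. The 70-character ceiling comes from the range given in the text; the repeat limit and the rule of ignoring words of three letters or fewer are our own assumptions, not known Google thresholds.

```python
import re
from collections import Counter

MAX_TITLE_CHARS = 70  # upper end of the 60-70 character guideline

def title_warnings(title: str, max_repeats: int = 2) -> list[str]:
    """Return human-readable warnings for an overlong or repetitive title."""
    warnings = []
    if len(title) > MAX_TITLE_CHARS:
        warnings.append(f"title is {len(title)} characters (over {MAX_TITLE_CHARS})")
    counts = Counter(re.findall(r"[a-z0-9]+", title.lower()))
    for word, n in counts.items():
        # Skip short words such as "the" or "and" (assumed cutoff).
        if len(word) > 3 and n > max_repeats:
            warnings.append(f"'{word}' appears {n} times")
    return warnings

print(title_warnings("Widgets widgets widgets cheap widgets"))
```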

Exceedingly Long Title Tag

Aside from a page as a whole, Keyword Stuffing penalties appear to be possible within the title tag. An ideal title tag should definitely be less than 60-70 characters and hopefully still provide enough value to function as a good search ad in Google's results. At absolute minimum, there is no benefit in using the same keyword five times in the same tag.

Source(s): Matt Cutts

Keyword-Dense Heading Tags

Heading Tags, such as H1, H2, H3, etc. can add additional weight to certain words. Those attempting to abuse this positive ranking factor will find that they can't simply cram as many keywords as they can into these tags, even if the tags themselves grow to be no lengthier than usual. Keyword Stuffing penalties appear to be possible simply as a function of the total space within these tags.

Source(s): Matt Cutts

Heading Tag (H1, H2, etc.) Overuse

As a general rule, if you want a concrete answer of whether or not an SEO penalty exists, try pushing a positive ranking factor well beyond what seems sane. One easily verified penalty involves placing your entire website in an H1 tag. Too lazy for that? Matt Cutts drops a less-than-subtle hint about too much text in an H1 in this source.

Source(s): Matt Cutts (again)

URL Keyword Repetition

While there don't seem to be any penalties associated with using a word in a URL multiple times, the value added from keyword repetition in a URL appears to be basically nothing. This can be verified very simply by placing a word in a URL five times instead of just once.

Source(s): Speculation

Exceedingly Long URLs

Matt Cutts notes that after about five words, the additional value behind words in a URL dwindles. It's theorized and pretty replicable that this occurs in Google as well, although directly unconfirmed. Although they operate somewhat differently, Bing has also gone out of their way to confirm URL keyword stuffing is a penalty in their engine.

Source(s): Matt Cutts
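Both URL checks (repeated words and total word count) are easy to automate. The Python sketch below splits a URL path into words and counts them; the splitting rule (any non-alphanumeric character is a separator) and the example URL are assumptions made for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

def url_word_counts(url: str) -> Counter:
    """Count each word in the path of `url`, ignoring the domain name."""
    path = urlsplit(url).path.lower()
    return Counter(w for w in re.split(r"[^a-z0-9]+", path) if w)

counts = url_word_counts("https://example.com/shoes/cheap-shoes/shoes-sale")
total_words = sum(counts.values())
repeated = {word: n for word, n in counts.items() if n > 1}
print(total_words, repeated)
```

Here the path holds five words in total, which sits right at the "about five" mark, and "shoes" is repeated three times.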

Keyword-Dense ALT Tags

Given that ALT tag text is not generally directly visible on the page, ALT tag keyword stuffing has been widely abused. A few descriptive words are fine and actually ideal, but doing more than this can invite penalties.

Source(s): Matt Cutts

Exceedingly Long ALT Tags

Given that ALT tag text is not generally directly visible on the page, ALT tag keyword stuffing has been widely abused. A few descriptive words are fine and actually ideal, but doing more than this can invite penalties.

Source(s): Matt Cutts

Too Much "List-style" Writing

Matt Cutts has suggested that any style of writing that just lists a lot of keywords could also fit the description keyword stuffing. Example: listing way too many things, words, wordings, ideas, notions, concepts, keywords, keyphrases, etc. is not a natural form of writing. Too much of this sort of thing will draw devaluations and possibly penalties.

Source(s): Matt Cutts

JavaScript-Hidden Content

Although Google recommends against putting text in JavaScript as it is unreadable by search engines, that does not mean that Google does not crawl JavaScript. In extreme instances where JavaScript may be used to cloak non-JavaScript on-page text, it may still be possible to receive a cloaking penalty.

Source(s): Google

CSS-Hidden Content

This is one of the first and most well-documented on-page SEO penalties: intentionally hiding text or links from users, especially for the sake of loading the page up with keywords that are just for Google, can invite a nasty penalty. Some leeway appears to be given in legitimate circumstances, such as when using tabs or tooltips.

Source(s): Google

Foreground Matches Background

Another common issue that brings about cloaking penalties occurs when the foreground color matches the background color of certain content. Google may use their Page Layout algorithm for this to actually look at a page visually and prevent false positives. In our experience, this can still occur accidentally in a handful of scenarios.

Source(s): Google
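To catch accidental cases before Google does, you can compare the two colors yourself. The Python sketch below uses the WCAG contrast-ratio formula, which is an accessibility standard and not Google's method; we are only borrowing it as a convenient measure of how close two colors are. It accepts six-digit hex colors only.

```python
def _channel(c: int) -> float:
    """Linearize one 0-255 sRGB channel (WCAG 2.0 definition)."""
    s = c / 255
    return s / 12.92 if s <= 0.03928 else ((s + 0.055) / 1.055) ** 2.4

def relative_luminance(hex_color: str) -> float:
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _channel(r) + 0.7152 * _channel(g) + 0.0722 * _channel(b)

def contrast_ratio(fg: str, bg: str) -> float:
    """Ranges from 1.0 (identical colors) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

print(round(contrast_ratio("#fefefe", "#ffffff"), 2))  # near-invisible text
print(round(contrast_ratio("#000000", "#ffffff"), 2))  # maximum contrast
```

A ratio at or very near 1.0 means the text is effectively invisible against its background and is worth reviewing by hand.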

Doorway Pages

The use of Doorway Pages, or Gateway Pages, describes creating masses of pages that are intended to be search engine landing pages but do not provide value to the user. An example of this would be creating one product page for every city name in America, resulting in what's known as spamdexing, or spamming Google's index of pages.

Source(s): Google

Overuse Bold, Italic, or Other Emphasis

At minimum, if you place all the text on your site within a bold tag, for the reason that such text is often given additional weight compared to the rest of the page, you haven't cracked some code that just makes your whole site rank better. This sort of activity fits Google's frequent blanket description of "spammy activity", and we have verified such penalties in our own non-public studies for clients.

Source(s): Matt Cutts

Text in Images

Google has come a long way at analyzing images, but on the whole, it's very unlikely that text that you present in rich media will be searchable in Google. There's no direct devaluation or penalty when you put text in an image; it just prevents your site from having any chance to rank for these words.

Source(s): Matt Cutts

Text in Video

Just like with images, the words that you use in video can't be reliably accessed by Google. If you are publishing video, it's to your benefit to always publish a text transcript such that the content of your video is completely searchable. This is true regardless of rich media format, including HTML5, Flash, SilverLight, and others.

Source(s): Matt Cutts

Text in Rich Media

Google has come a long way at analyzing images, videos, and other formats of media such as Flash, but on the whole, it's very unlikely that text that you present in rich media will be searchable in Google. There's no devaluation or penalty here; it just prevents your site from having any chance to rank for these words.

Source(s): Matt Cutts

Frames/Iframes

In the past, search engines were entirely unable to crawl through content located in frames. Though they've overcome this weakness to an extent, frames do still present a stumbling point for search engine spiders. Google attempts to associate framed content with a single page, but it's far from guaranteed that this will be processed correctly.

Source(s): Google

Dynamic Content

Dynamic content can create a number of challenges for search engine spiders to understand and rank. Using noindex and minimizing use of such content, especially where accessible by Google, is believed to result in a more positive overall user experience and likely to draw preferential treatment in rankings.

Source(s): Matt Cutts

Thin Content

Although it's always been better to write more elaborate content that covers a topic thoroughly, the introduction of Navneet Panda's "Panda" algorithm established a situation where content with basically nothing of unique value would be severely punished in Google. An industry-wide recognized case study on Dani Horowitz's "DaniWeb" forum profile pages serves as an excellent example of Panda's most basic effects.

Source(s): DaniWeb Study

Domain-Wide Thin Content

For a very long time, Google has made an effort to understand the quality and unique value presented by your content. With the introduction of the Panda algorithm, this became an issue that was scored domain-wide, rather than on a page-by-page basis. As such, it's now usually beneficial to improve the average quality of content in search engines, while using 'noindex' on pages that are doomed to be repetitive and uninteresting, such as blog "tag" pages and forum user profiles.

Source(s): Google

Too Many Ads

Pages with too many ads, especially above-the-fold, create a poor user experience and will be treated as such. Google appears to base this on an actual screenshot of the page. This is a function of the Page Layout algorithm, also briefly known as the Top Heavy Update.

Source(s): Google

Use of Pop-ups

Although Google's Matt Cutts answered no to this question in 2010, Google's John Mueller said yes in 2014. After weighing both responses and understanding the process behind the Page Layout algorithm, our tie-breaking ruling is also "yes": using pop-ups can definitely harm your search rankings.

Source(s): Google

Duplicate Content (3rd Party)

Duplicate content that appears on another site can bring about a significant devaluation even when it's not in violation of copyright guidelines and properly cites a source. This falls in line with a running theme: content that is genuinely more unique and special against a backdrop of the web as a whole will perform better.

Source(s): Google

Duplicate Content (Internal)

Similar to content duplicated from another source, any snippet of content that is duplicated within a page or even the site as a whole will endure a decrease in value. This is an extremely common issue and can creep up from anything ranging from too many indexed tag pages to www vs. non-www versions of the site to variables appended to URLs.

Source(s): Google
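When auditing a crawl for this kind of accidental duplication, it helps to reduce every URL to a comparable form first. The Python sketch below folds www/non-www hosts together, drops a few appended variables, and trims trailing slashes. The parameter list is a made-up example, and the choice to treat the non-www host as the "real" one is arbitrary.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of appended variables that do not change the content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}

def normalize_url(url: str) -> str:
    """Reduce a URL to a form where likely duplicates compare equal."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    query = sorted(
        (k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS
    )
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, host, path, urlencode(query), ""))

a = normalize_url("http://WWW.Example.com/shoes/?utm_source=newsletter")
b = normalize_url("http://example.com/shoes")
print(a == b)
```

URLs that normalize to the same string but are all indexable are candidates for a redirect or some other consolidation.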

Linking to Penalized Sites

This was introduced as the "Bad Neighborhood" algorithm. To quote Matt Cutts: "Google trusts sites less when they link to spammy sites or bad neighborhoods". Simple as that. Google has suggested using the rel="nofollow" attribute if you must link to such a site. To quote Matt again: "Using nofollow disassociates you with that neighborhood."

Source(s): MC: Bad Neighbors, MC: Nofollow

Slow Website

Slow sites will not rank as well as fast ones. This factor is executed with the target audience in mind, so seriously consider the geography, devices, and connection speeds of your audience. Google has repeatedly suggested "under two seconds", and says that they aim for under 500ms.

Source(s): Google

Page NoIndex

If a page contains the meta tag for "robots" that carries the value "noindex", Google will never place it in its index. If used on a page that you want to rank, it's a bad thing. It can also be a good thing when removing pages that will never be good for Google users, elevating the average experience for visitors arriving from Google.

Source(s): Logic
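Because a stray "noindex" is easy to miss, it can be worth scanning pages for it. The following Python sketch uses only the standard library to pull every directive out of a page's "robots" meta tag; the class and function names are our own, and it also checks the "googlebot" meta name, which targets Google's crawler specifically.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in ("robots", "googlebot"):
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )

def robots_directives(html: str) -> set[str]:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives

page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print("noindex" in robots_directives(page))
```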

Internal NoFollow

This can appear two ways: if a page contains the "robots" meta tag with the value "nofollow", it will imply that the rel="nofollow" attribute is added to every link on the page. Or, it can be added to individual links. Either way, this is taken to mean "I don't trust this", "crawl no further", and "do not give this PageRank". Matt does not mince words here: just never "nofollow" your own site.

Source(s): Matt Cutts

Disallow Robots

If your site has a file named robots.txt in the root directory with a "User-agent:" line naming either "*" or "Googlebot", followed by a "Disallow: /" statement, your site will not be crawled. This will not remove your site from the index. But it will prevent any updating with fresh content, or positive ranking factors that surround age and freshness.

Source(s): Google
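You can check how a given robots.txt will be interpreted without waiting for a crawler. The Python sketch below feeds an example file to the standard library's urllib.robotparser, where a "User-agent" line names the crawler and "Disallow" lines list the blocked paths. The file contents and URLs are invented for illustration, and Python's parser may differ from Googlebot's own in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Example file: Googlebot is blocked everywhere, other bots only from /private/.
robots_txt = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

googlebot_allowed = parser.can_fetch("Googlebot", "https://example.com/page.html")
other_allowed = parser.can_fetch("SomeOtherBot", "https://example.com/page.html")
other_private = parser.can_fetch("SomeOtherBot", "https://example.com/private/x.html")

print(googlebot_allowed, other_allowed, other_private)
```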

Poor Domain Reputation

Domain names maintain a reputation with Google over time. Even if a domain changes hands and you are now running an entirely different web site, it's possible to suffer from webspam penalties incurred by the poor behavior of previous owners.

Source(s): Matt Cutts

IP Address Bad Neighborhood

While Matt Cutts has gone out of his way to debunk the long-standing practice of "SEO web hosting" on dedicated IP addresses serving any real benefit, this is contradicted by the notion that in rare cases, Google has penalized entire server IP ranges where they might be associated with a private network or bad neighborhood.

Source(s): Matt Cutts

Meta or JavaScript Redirects

This is a classic SEO penalty that isn't too common anymore. Google recommends not using meta-refresh and/or JavaScript timed redirects. These confuse users, increase bounce rates, and are problematic for the same reasons as cloaking. Use a 301 (if permanent) or 302 (if temporary) redirect at the server level instead.

Source(s): Google
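What "at the server level" means depends on your stack, but the principle is the same everywhere: the HTTP response itself carries the redirect status and a Location header. Here is a minimal sketch as a Python WSGI application; the paths in the redirect table are hypothetical.

```python
# Hypothetical redirect table: old path -> (new path, is_permanent)
REDIRECTS = {
    "/old-page": ("/new-page", True),
    "/sale": ("/summer-sale", False),
}

def app(environ, start_response):
    """WSGI app that answers known old paths with a 301 or 302."""
    path = environ.get("PATH_INFO", "/")
    if path in REDIRECTS:
        target, permanent = REDIRECTS[path]
        status = "301 Moved Permanently" if permanent else "302 Found"
        start_response(status, [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Regular page content"]
```

For a local trial, this can be served with wsgiref.simple_server.make_server from the standard library. On most production sites the equivalent rule would live in the web server or framework configuration instead.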

Text in JavaScript

While Google continues to improve at crawling JavaScript, there's still a fair chance that Google will have trouble crawling content that's printed using JavaScript, and further concern that Googlebot won't fully understand the context of when it gets printed and to whom. While printing text with JavaScript won't cause a penalty, it's an undue risk and therefore a negative factor.

Source(s): Matt Cutts

Poor Uptime

Google can't (re)index your site if they can't reach it. Logic also would dictate that a site that's unreliable also leads to a poor Google user experience. While one outage is unlikely to be devastating to your rankings, achieving reasonable uptime is important. One or two days should be fine. More than this will cause problems.

Source(s): Matt Cutts

Private Whois

While it's often pointed out that Google can't always access whois data from every registrar, Matt Cutts made it clear at PubCon 2006 that they were still looking at this data, and that private whois, when combined with other negative signals, may lead to a penalty.

Source(s): Matt Cutts

False Whois

Similar to private whois data, it's been made clear that representatives from Google are aware of this common trick and treating it as a problem. If for no reason other than it being a violation of ICANN guidelines, and potentially allowing a domain hijacker to steal your domain via a dispute without you getting a say, don't use fake information to register a domain.

Source(s): Matt Cutts

Penalized Registrant

If you subscribe to the notion that private and false whois records are bad, and take into account that Matt Cutts has discussed using this as a signal that identifies webspam, it stands to reason that a domain owner can be flagged and penalized across numerous sites. This is unconfirmed and purely speculative.

Source(s): Speculative

ccTLD in Global Ranking

ccTLDs are country-specific domain suffixes, such as .uk and .ca. They are the opposite of gTLDs, which are global. These are useful in executing international SEO, but can be equally problematic when attempting to rank outside of these countries. An exception to this rule is that a small number of ccTLDs have been widely used for other purposes such as .co, and have been labeled by Google as "gccTLDs".

Source(s): Google

Invalid HTML/CSS

Matt Cutts has said no to this being a factor. Despite this, our experience has consistently indicated yes. Code likely doesn't have to be perfect and this may be an indirect effect. But the negative effects of bad code are supported by logic as you consider other code-related factors (hint: there's a code filter up top). Bad code can cause countless, potentially invisible issues including tag usage, page layout, and cloaking.

Source(s): Matt Cutts

Parked Domain

A parked domain is a domain that does not yet have a real website on it, often sitting unused at a domain registrar outside of some machine-generated advertising. These days, such a domain fails to meet so many other ranking criteria that it probably wouldn't have much success in Google anyway. Parked domains once had some. But Google has repeatedly made it clear that they don't want to rank parked domains of any kind.

Source(s): Google

Search Results Page

Generally speaking, Google wants users to land on content, not other pages that look like listings of potential content, like the Search Engine Results Page (SERP) that such a user just came from. If a page looks too much like a search results page, by functioning as just an assortment of more links, it's likely to not rank as well. This may also apply to blog posts outranking tag/category pages.

Source(s): Matt Cutts

Automatically Generated Content

Machine-generated content that's based upon a user's search query will "absolutely be penalized" by Google and is considered a violation of the Google Webmaster Guidelines. There are a number of methods that could qualify, which are detailed in the Guidelines. One exception to this rule appears to be machine-generated meta tags.

Source(s): Matt Cutts, Webmaster Guidelines

Infected Site

Many website owners would be surprised to know that most compromised web servers are not defaced. Often, the offending party will actually go so far as to patch your security holes to protect their newfound property, without you ever knowing. This will then manifest itself in the form of malicious activity enacted on your behalf such as virus/malware distribution and further exploits, which Google takes very seriously.

Source(s): Webmaster Guidelines

Phishing Activity

If Google might have reason to confuse your site with a phishing scheme (such as one that aims to replicate another's login page to steal information), prepare for a world of hurt. For the most part, Google simply uses a blanket description of "illegal activity" and "things that could hurt our users", but in this interview, Matt specifically mentions their anti-phishing filter.

Source(s): Matt Cutts

Outdated Content

A Google patent exists surrounding stale content, which is identified in a variety of ways. One such method for defining stale content basically just surrounds being old. What is unclear is whether this factor harms rankings on all queries, or simply when a particular search query is associated with something Google refers to as Query Deserves Freshness (QDF), which means exactly what it sounds like.

Source(s): Patent US 20080097977 A1

Orphan Pages

Orphan pages, meaning pages of your site that are difficult or impossible to find using your internal link architecture, can be treated as Doorway Pages and act as a webspam signal. At minimum, such pages likely do not benefit from internal PageRank, and therefore have far less authority.

Source(s): Google Webmaster Central

Sexually Explicit Content

While Google does index and return X-rated content, it's not available when their Safe Search feature is turned on, which is Google's default state. It's therefore reasonable to consider that unmoderated user-generated content or one-time content that inadvertently crosses a certain line may be blocked by the Safe Search filter.

Source(s): Google Safe Search

Subdomain Usage (N)

Subdomains (thing.yoursite.com) are often viewed as separate websites by Google, as compared to subfolders (yoursite.com/thing/), which are not. This can be negative in a number of ways as it relates to other factors. One such scenario would involve a single, topical site with many subdomains, not benefiting from factors on this page that have "domain-wide" in their names.

Source(s): Matt McGee and Paul Edmondson

Number of Subdomains

The number of subdomains on a site appears to be the most significant factor in determining whether subdomains are each treated as their own sites. Using an extremely large number of subdomains, although not a terribly easy thing to do by mistake, could theoretically cause Google to treat one site like many sites, or many sites like one site.

Source(s): Speculation

HTTP Status Code 4XX/5XX on Page

If your web server returns pretty much anything other than a status code of 200 (OK) or 301/302 (redirect), it is implying that the appropriate content was not displayed. Note that this can happen even if you are able to view the intended content yourself in your browser. In cases where content is actually missing, it's been clarified by Google that a 404 error is fine and actually expected.

Source(s): Speculation

Domain-wide Ratio of Error Pages

Presumably, the possibility for users to land on pages that return 4XX and 5XX HTTP errors is a sure mark of an overall low-quality website. We speculate this is a problem in addition to pages that are not indexed due to carrying such an HTTP header, and pages that include broken outbound links.

Source(s): Speculation

Code Errors on Page

Presumably, if a page is full of errors generated by PHP, Java, or other server-side language, it meets Google's definitions of a poor user experience and a low quality site. At absolute minimum, error messages within the page text likely interfere with Google's overall analysis of the text on the page.

Source(s): Speculation

Soft Error Pages

Google has repeatedly discouraged the use of "soft 404" pages or other soft error pages. These are basically error pages that still return HTTP code 200 in the document headers. Logically, this is difficult for Google to process correctly, and even though your users see an error page, Google may, at minimum, treat these as actual low-quality pages on your site, significantly lowering how the overall quality of your domain's content is scored.

Source(s): Google
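A soft error page is simply a mismatch between the status code and what the page says, so a crude check is easy to write. The Python sketch below flags responses that return 200 but read like an error page. The phrase list is an invented starting point and will produce false positives (a product page for "Model 404", for example); how Google detects these pages is not something we know.

```python
# Hypothetical phrases that tend to appear on error pages.
ERROR_PHRASES = ("page not found", "404", "no longer available")

def looks_like_soft_404(status_code: int, body_text: str) -> bool:
    """True when a page reports success but its text reads like an error."""
    if status_code != 200:
        return False  # a real error status is not a "soft" error
    text = body_text.lower()
    return any(phrase in text for phrase in ERROR_PHRASES)

print(looks_like_soft_404(200, "Sorry, page not found."))
print(looks_like_soft_404(404, "Sorry, page not found."))
```

The fix for any page this flags correctly is to have the server return a real 404 (or 410) status along with the friendly error message.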

HTTP Expires Headers

Setting "Expires" headers with your web server can control browser caching and improve performance. Unfortunately, depending on how they're wielded, they can also cause problems with search indexing, by telling search engines that content will not be fresh again for potentially a long time. In all cases, they may tell Googlebot to go away for longer than desired, as their analysis seeks to emulate a real user experience.

Source(s): Moz Discussion
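For reference, an "Expires" header is just a date in a fixed HTTP format. The Python sketch below builds one for a chosen lifetime, which makes it easier to see exactly what a given setting will tell browsers and crawlers. The function name and the one-hour lifetime are arbitrary examples, not recommendations.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Optional

def expires_header(lifetime: timedelta, now: Optional[datetime] = None) -> str:
    """Return an HTTP-date string `lifetime` after `now` (defaults to current UTC)."""
    now = now or datetime.now(timezone.utc)
    return format_datetime(now + lifetime, usegmt=True)

# A fixed starting time keeps the example reproducible.
start = datetime(2015, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
print("Expires:", expires_header(timedelta(hours=1), now=start))
```

Content that changes often would get a short lifetime, while static assets such as images can safely carry a much longer one.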

Sitemap Priority

Many theorize that the "priority" attribute assigned to individual pages in an XML sitemap has an impact on crawling and ranking. Much like other signals that you might hand to Google via Search Console, it seems unlikely that some pages would really rank higher just because you asked. The attribute is probably mainly useful as a signal to de-prioritize less important content.

Source(s): Sitemaps.org

Sitemap ChangeFreq

The ChangeFreq variable in an XML sitemap is intended to indicate how often the content changes. It's theorized that Google may not re-crawl content more often than you say it changes. It's unclear, however, whether Google actually follows this attribute; if they do, it seems that it would yield a similar result to adjusting the crawl speed in Google Search Console.

Source(s): Sitemaps.org
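For readers who haven't seen them, the "priority" and "changefreq" sitemap attributes are plain child elements of each URL entry. The Python sketch below generates a small sitemap in the Sitemaps.org format using the standard library; the URLs and values are placeholders, and nothing here implies that Google acts on them.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages) -> str:
    """`pages` is an iterable of (url, changefreq, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq, priority in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq  # e.g. daily, weekly
        ET.SubElement(url, "priority").text = f"{priority:.1f}"  # 0.0 to 1.0
    return ET.tostring(urlset, encoding="unicode")

sitemap_xml = build_sitemap([
    ("https://example.com/", "daily", 1.0),
    ("https://example.com/tags/misc", "monthly", 0.2),
])
print(sitemap_xml)
```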

Keyword-Stuffed Meta Description

It's theorized that, even though Google now tells us that they don't use meta descriptions in web ranking, only for ads, it may still be possible to send webspam signals to Google if there's an apparent attempt to abuse the tag.

Source(s): Speculation

Keyword-Stuffed Meta Keywords

Since 2009, Google has said that they don't look at meta keywords at all. Despite this, the tag is still widely abused by people who don't understand or believe that idea. It's theorized that because of the latter fact, this tag may yet serve to send webspam signals to Google.

Source(s): Matt Cutts

Spammy User-Generated Content

Google can single out problems appearing in the user-generated portions of your site and issue very targeted penalties in such a context. This is one of few circumstances where a warning may appear in Google Search Console. We're told these penalties are usually limited to certain pages. We've found that WordPress trackback spam appearing in a hidden DIV is one way that this penalty can creep up undetected.

Source(s): Matt Cutts

Foreign Language Non-Isolation

Obviously, if you write in a language that doesn't belong to your target audience, almost no positive, on-page factors can work their charm. Matt Cutts admits that improperly isolated foreign language content can be a stumbling point both for search spiders and for users. To not interfere with positive ranking factors, Google needs to be able to interrelate content on the page as well as sections of a site.

Source(s): Matt Cutts

Auto-Translated Text

Using Babelfish or Google Translate to rapidly "internationalize" a site is a surprisingly frequent practice for something that Matt Cutts explicitly states is a violation of their Webmaster Guidelines. For those fluent in Google-speak, that usually means "it's not just a devaluation, it's a penalty, and probably a pretty bad one". In a Google Webmaster video, Matt categorizes machine translations as "auto-generated content".

Source(s): Matt Cutts

Missing Robots.txt

As of 2016, Google Search Console advises site owners to add a robots.txt file to their site when one is missing. This has led many to theorize that a missing robots.txt file is bad for rankings. We consider this odd, given that Google Search's John Mueller advises removing robots.txt entirely when Googlebot is entirely welcome. We chalk this myth up to department miscommunication.

Source(s): John Mueller via SER

All nofollow

In an impressively inconclusive video, Matt Cutts tells us that Google "would like to see" sites like Wikipedia hand-selecting a few links to not be "nofollow", but never states the value. The apparent ranking success of sites with 100% "nofollow" on their outbound links, like Wikipedia, seems to suggest that there's no significant harm done. If anything at all, they may lose some positive value attributed to good outbound links.

Source(s): Matt Cutts

Site Lacks Theme

One of the most popular case studies following Panda's launch was of HubPages, who ultimately repaired their damage by using subdomains to isolate many unrelated sites from one. While the Hilltop update apparently began rewarding domains for having a core expertise in 2004, Panda apparently began punishing a lack thereof in 2011.

Source(s): Paul Edmondson (HubPages)

Weak SSL Ciphers

SSL encryption is confirmed as a positive factor. This suggests that Google wants to reward superior security for their users. So is it possible that Google is rewarding the quality of security as well? It would be incredibly easy for Google to test SSL ciphers - even easier than current, confirmed malware tests. But at present, we have no evidence beyond it being a logical fit.

Source(s): Speculation

X-Robots-Tag HTTP Header

While the most common ways to block search engine crawlers are within your HTML, or a separate robots.txt file, it's also possible at the server level. Used correctly, this can be useful for blocking thin content. But if unintended, as the obscure nature of this approach more often is (in our experience), the consequences here are more often negative.

Source(s): Google Developers

Commercial Queries (YMYL)

Google frequently uses the phrase "commercial queries" to refer to searches related to a transaction. The Quality Rater Guidelines ask QA auditors to identify "Your Money or Your Life (YMYL)" content, discussing a heightened concern for legitimacy on searches related to money and health. It's not certain what the full impact of this is on the search algorithm, but a Google search for "commercial queries" can be found relating this concept to several other signals.

Source(s): Matt McGee via SEL

Positive Off-Page Factors

Off-Page Factors describe events that take place somewhere other than on the site that you directly control and are trying to improve performance of in the rankings. This usually takes the form of backlinks from other sites. Positive Off-Page Factors generally relate to an attempt to understand honest, natural popularity, with a large emphasis on popularity achieved from more-trusted and influential sources.

Positive Off-Page Factors

Social Signals

This phrase, coined by Google, refers to ongoing experimentation with sharing and reputation on social media to further appraise the authority of a site. After launching Google+ and ending their firehose agreement with Twitter, Matt Cutts says this is less of a thing as they experimented with Google+ data. Recent studies still confirm that positive social reputation correlates, directly or indirectly, with better rankings.

Source(s): Matt Cutts, Moz Study

Keyword Anchor Text

The anchor text used in an external link will help establish relevance of a page towards a search term. The target page does not need to contain this term to rank (see: Google Bombing).

Source(s): Patent US 8738643 B1

Keyword ALT Text

Keywords used in the ALT attribute of an image are treated as anchor text. Short, genuinely descriptive ALT tags also improve overall accessibility and have an exceedingly strong impact on images appearing in-line with searches from Google Image Search.

Source(s): Patent US 8738643 B1, Matt Cutts

Brand Name Citation

A major factor of local SEO, or Google Maps SEO, is local citations: brand mentions with company name, address, and phone, but no backlink. Rand from Moz noted a case study that he believed supported speculation that this was making its way into "traditional SEO" as well. This study was, however, debunked by several comments without rebuttal, so for now, we consider it a myth.

Source(s): Moz Study

DMOZ Listing

Of all the websites on the Internet you could get a backlink from, one magical opportunity defies the laws that the rest abide by. That's DMOZ: Directory Mozilla, The Open Directory Project, once the data feed for The Google Directory. It's a political nightmare rife with corruption, but it's human-edited, and when you finally get listed the effects are noticeable, even in 2015.

Source(s): Matt Cutts

Click Through Rate on Query/Page

It's been heavily theorized that Click Through Rate from the results page is a ranking factor. It's a Bing ranking factor. Matt glossed over ranking implications in 2009. Repeatedly, Rand Fishkin has used Twitter to lead experiments which look surprisingly conclusive at confirming that CTR is a ranking factor.

Source(s): Moz Study, Patent US 9031929 B1

Click Through Rate on Domain

A patent by Navneet Panda (of the Panda algorithm) describes assigning site quality scores based on CTR for various searches. The title of this patent is literally "Site quality score". It also speaks of branded search queries, followed by clicks, as the primary method. Still, these factors, in addition to evidence for search query CTRs as a factor, seem to suggest that sitewide CTR may be a factor.

Source(s): Patent US 9031929 B1

Low Bounce Rate

It's been theorized that Google looks at search user bounce rate as a ranking factor. Even without Google Analytics or Chrome data this could be easily measured in several ways. Matt Cutts says no, and that tracking how long users remain on a page would be "spammable and noisy". Yet, SEO Black Hat and Rand Fishkin have run studies that indicate otherwise, and Bing's Duane Forrester has clearly confirmed that Bing uses it; a factor that they call "dwell time".

Source(s): Matt Cutts via SER

Google+ Profile

Although a somewhat unpredictable factor, making use of Google+ can carry with it a variety of ranking benefits. Some speculate that Google+ could help with traditional rankings; however, we believe that the only real benefits as of writing this manifest themselves in non-traditional ways. For examples of the rankings that may be achieved with Google+, see Dr. Pete Meyers' 2013 MozCon presentation, Beyond 10 Blue Links.

Source(s): Dr. Pete’s Study

Twitter Followers

It's been heavily theorized that a brand's number of Twitter followers might be a direct ranking factor. Claims from Google, however, are to the contrary. While it's true that a Twitter audience is an invaluable asset for nurturing a community of brand advocates, that manifests other benefits in the way of long-term drip marketing, word of mouth sales, and backlinks from your content, all evidence indicates that Google is not currently looking at this information.

Source(s): Matt Cutts

Twitter Sharing

According to Google, shares on social media are basically just treated like more backlinks, and there is no additional, direct benefit at the present that comes from content being shared on Twitter. In 2010, Google told Danny Sullivan "who you are on Twitter matters". In 2014, Matt Cutts said "to the best of his knowledge", nothing like this existed.

Source(s): Matt Cutts, Danny Sullivan

Facebook Likes

It's been heavily theorized that a brand's number of Facebook page "likes" might be a direct ranking factor. Claims from Google, however, are to the contrary. While it's true that a Facebook audience is an invaluable asset for nurturing a community of brand advocates, that manifests other benefits in the way of long-term drip marketing, word of mouth sales, and backlinks from your content, all evidence indicates that Google is not currently looking at this information.

Source(s): Matt Cutts

Facebook Sharing

According to Google, shares on social media are basically just treated like more backlinks, and there is no additional, direct benefit at the present that comes from content being shared on Facebook. In 2010, Google told Danny Sullivan "who you are on Twitter matters". In 2014, Matt Cutts said "to the best of his knowledge", nothing like this existed.

Source(s): Matt Cutts, Danny Sullivan

Google+ Circles

For a short while, Google's "Search, plus Your World" campaign enabled social search functionality that put personalized search results into overdrive. Massive ranking preference was given to documents/sites +1'd by your Circlers. It appears now that this was deemed a failed experiment. Although there's no harm in handing Google more positive signals for the future, there's also no evidence that Circles matter right now.

Source(s): Google

Google+ "+1's"

Google began experimenting with the Google+ "+1" button most everywhere when it rolled out. We all thought it could impact rankings. Cyrus Shepard posted a correlation study on Moz that showed that sites that were popular enough to get more +1's ranked better. A day later, Matt Cutts told us all of that was bogus and that +1's don't directly impact rankings. For now, we call this "iffy", bordering on "myth".

Source(s): Moz correlation study, Matt Cutts via SER

Query Deserves Freshness (QDF)

Google doesn't rank every search query the same way. Certain search queries, especially those that are news-related, are especially sensitive to the freshness of content that they will publish (and may only rank content that is recent). Google's term for this is Query Deserves Freshness (QDF).

Source(s): Matt Cutts, Amit Singhal

Query Deserves Sources (QDS)

A phrase that we've coined to cover a scenario described in Google's Quality Rater Guidelines, used when humans conduct quality control on Google search results. This asks: "this is a topic where expertise and/or authoritative sources are important". Presumably, this applies to all informational search queries (in contrast to transactional and navigational queries).

Source(s): Barry Schwartz

Query Deserves Oldness (QDO)

This is a phrase that we made up to describe a situation detailed in a Google patent. It's specifically noted that: "For some queries, older documents may be more favorable than newer ones." The patent then goes on to describe the process in which documents would be ranked by their age, as a function of the average age of results for that query.

Source(s): Patent US 8549014 B2

Query Deserves Diversity (QDD)

Certain search queries are ranked differently by Google. One theory is called Query Deserves Diversity, likely dependent on a concept called entity salience by attaching meaning to the same word with differing definitions. As a bit of a riff on the concept of Query Deserves Freshness, this would be similar to a Wikipedia disambiguation page, where the search query is vague and a variety of result types are needed at the top of the results. Unconfirmed, but easily replicated.

Source(s): Rand Fishkin

Use AdWords

SEO paranoia seems to prevent this myth from dying. There are no credible studies that we have encountered that suggest AdWords will improve rankings in any way. AdWords influencing organic rankings runs counter to Google's core philosophies, and nobody is more vigilant about speaking out against this myth than Google.

Source(s): Matt Cutts

Don't Use AdWords

Just as using AdWords is allegedly a ranking factor in some very un-scientific circles, so is not using AdWords. The notion that AdWords can have any influence on Google's organic rankings in any way, now or in the future, has been dispelled by Google maybe more aggressively than any other SEO myth.

Source(s): Matt Cutts

Chrome Page Bookmarks

Although directly denied by Matt Cutts, this was affirmed at the 2013 BrightonSEO conference during the ex-Googler fireside. It's also suggested by a Google Patent, which states: "Search engine may then analyze over time a number of bookmarks/favorites to which a document is associated to determine the importance of the document."

Source(s): BrightonSEO Fireside

Chrome Site Traffic

Although this too is denied by Google, the patent "Document scoring based on traffic associated with a document" touches on using browser traffic data for the purposes of ranking sites, stating: "information relating to traffic associated with a document over time may be used to generate (or alter) a score associated with the document."

Source(s): Patent US 20070088693, Lifehacker Analysis

User Search History

It's common to be served personalized search results based on your search history unless you have specifically disabled this feature in Google. As of 2009, signing into a Google account is not a requirement for being served results that are personalized based upon your recent search history.

Source(s): Brian Horling

Google Toolbar Activity

Just as Matt Cutts stated that Google Chrome data is not used in determining rankings in Google's organic search results, the same was said for the Google Toolbar. Despite this, it's widely reported by SEOs, which may relate to a Google Patent that directly discusses a method of doing exactly this via a browser plugin.

Source(s): Matt Cutts via SER, Patent US 20070088693

Low Alexa Score

While there are patents and speculation that suggest that Google could theoretically look at site traffic as a ranking factor, there's absolutely no evidence to support that they are doing so using Alexa at present. In what documentation does exist, it's suggested that they would do this using Chrome data, which by the way, they've totally cleared themselves to do.

Source(s): Patent US 20070088693

High MozRank/MozTrust Score

The "toolbar PageRank" scores that we see do not match the actual PageRank data that Google Search uses. That data is often wildly inaccurate these days, and that's led many to defer to MozRank. Despite this, Google has always been in the business of computing the value of links on their own, and while Moz data might correlate, it's not related to rankings in any way. The same goes for any other third party metrics like Majestic or Ahrefs.

Source(s): Speculation

Total Branded Searches + Clicks

Navneet Panda's patent titled "Site quality score" describes a scenario where navigational brand searches in Google (such as "Northcutt contact page") contribute to a domain-wide quality score. It states: "The score is determined from quantities indicating user actions of seeking out and preferring particular sites and the resources found in particular sites."

Source(s): Patent US 9031929 B1

High Dwell Time (Long Clicks)

The "Site quality score" patent describes a scenario that rewards branded searches + clicks as a ranking factor. As a part of their methods, it also states: "Depending on the configuration of the system .... a click of at least a certain duration, or a click of at least a certain duration relative to a resource length, for example, may be treated by the system as being a user selection." It's also supported by several other sources and used by Bing and Yahoo.

Source(s): Patent US 9031929 B1, Bill Slawski
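
The "long click" test quoted from the patent can be sketched in Python as follows. This is only an illustration of the two variants the quote describes (a flat minimum duration, or a duration relative to the length of the resource). The function name, both threshold values, and the choice to measure resource length in words are our own assumptions; the patent publishes no such numbers.

```python
from typing import Optional

# Both thresholds are invented for illustration; no real values are public.
ASSUMED_MIN_SECONDS = 30.0
ASSUMED_MIN_SECONDS_PER_1000_WORDS = 20.0


def counts_as_selection(click_seconds: float, resource_words: Optional[int] = None) -> bool:
    """Decide whether a click lasted long enough to count as a user selection.

    Without a resource length, compare against a flat minimum duration.
    With a resource length (assumed here to be a word count), scale the
    required duration to that length.
    """
    if resource_words is None:
        return click_seconds >= ASSUMED_MIN_SECONDS
    required = ASSUMED_MIN_SECONDS_PER_1000_WORDS * resource_words / 1000
    return click_seconds >= required


# A 45-second visit passes the flat test, but not the relative test
# for a hypothetical 5,000-word page.
print(counts_as_selection(45))
print(counts_as_selection(45, resource_words=5000))
```

Under these assumed numbers, the same 45-second visit counts on a short page and does not count on a very long one, which is the point of measuring duration relative to resource length.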

Submit Site to Google

Google has long had a tool that allowed you to submit your site to be crawled. A long-standing myth is that this provides any ranking benefits whatsoever. In fact, in cases where a site is not even in the index, it almost appears to be a placebo button. For your site to rank, on top of Google simply being aware of it, Google will need to find it through some worthwhile links.

Source(s): Google

Submit Sitemap Tool

It's possible to submit an XML Sitemap to Google using Google Search Console. This does appear to get more pages into the index in some cases, but for much the same reasons that the raw "submit site" concept is not ideal, neither is "Submit Sitemap". If Google couldn't find the pages on its own, they're likely doomed never to rank. And as Rand Fishkin points out, this tool stops many diagnostic processes cold.

Source(s): Rand Fishkin

International Targeting Tool

Google Search Console provides a tool for international targeting when it may not be done correctly otherwise, mainly for use with a generic TLD like ".com", or "gccTLDs" like .co that were intended for a particular country, but widespread use has caused Google to treat them more generically. This can help with rankings in certain countries in certain situations.

Source(s): Google

Reconsideration Requests

Google's reconsideration request tool is generally the answer to a manual action. This tool essentially petitions Google to have someone manually review a site to determine whether or not a manually placed penalty should be removed. Considering that manual actions make up an extremely small portion of negative ranking factors, this tool should rarely be necessary.

Source(s): Google

Google+ Local Verified Address

It's often theorized that a Google+ Local page, in which businesses verify their address using a postcard for listing in Google Maps, is a ranking factor in Google's primary web search results. While it's true that this is a significant ranking factor for Google Maps searches, and when the local listings box is imposed in-line with traditional Google search results, we've found no evidence to support this theory.

Source(s): Speculation

Crawl Budget

The number of pages that Google will crawl and index on your site is proportional to the overall authority that your site has achieved from inbound links. Lower authority sites are allocated a smaller "crawl budget".

Source(s): Matt Cutts via Eric Enge via AJ Kohn

Android Pay

It's theorized, much like Chrome, Analytics, and other Google resources, that Google could use Android Pay data as a ranking signal. By associating a Google account, they could have knowledge of search queries that led to purchase. This is, however, completely theoretical and without any evidence (for or against) that we're currently aware of.

Source(s): Speculation

Negative Off-Page Factors

Negative Off-Page Factors are generally related to unnatural patterns of backlinks to your site, usually due to intentional link spam. Until the Penguin algorithm was introduced in 2012, the result of these factors was almost always a devaluation, rather than a penalty. That is, you could lose all, or nearly all, value obtained from linking practices that Google felt may be unnatural, but your site would not be harmed otherwise. While that's still mostly true, Penguin introduced off-page penalties in a number of cases, which has opened the floodgates for malicious behavior from competing sites in a practice known as negative SEO or Google Bowling.

Negative Off-Page Factors

Excessive Cross-Site Linking

When owning multiple sites, it's discouraged to inter-link them for the purpose of inflating your inbound link authority. Risk increases with the number of inter-linked domains. Common ownership may be detected by domain registrant, IP address, similarity of content, or similarity of design, and, rarely, identified and penalized as part of a manual action. An exception is made for internationalization or "when there's a really good reason, for users, to do it".

Source(s): Matt Cutts

Negative SEO (Google Bowling)

Negative SEO, historically dubbed "Google Bowling", is malicious link spam conducted on behalf of your site by a third party. This was once very difficult, since we lived in a world of off-page devaluations, rather than off-page penalties. If a devaluation were to occur, a competitor could only exaggerate existing schemes, causing value to be lost sooner or more assuredly. If off-page penalties exist, which they do, negative SEO is proven by logic alone.

Source(s): Matt Cutts

Fresh Anchor Text

The age of anchor text used in a link, specifically anchor text that appears to be changing on another site, can signify a problem. Speculatively, this implies that the link is not actually from a third party and/or an active experiment in ranking manipulation.

Source(s): Patent US 8549014 B2

Diluted Page Authority

As a function of the PageRank algorithm, every link on a page divides the overall authority that is passed to the pages that are linked. For example, one page with one link may pass a hypothetical PageRank value of 1.0, whereas an identical page with 1,000 outbound links would pass 0.001.

Source(s): Matt Cutts, Larry Page
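
The division described under Diluted Page Authority can be sketched in a few lines of Python. This is a deliberately simplified model that only reproduces the arithmetic of the example in the text. The function name `authority_per_link` is our own, and the real PageRank computation (damping factor, iteration across the whole link graph) is left out.

```python
def authority_per_link(page_authority: float, outbound_links: int) -> float:
    """Split a page's hypothetical authority evenly across its outbound links.

    Simplifying assumption: no damping factor and no iteration, just the
    even split described in the text.
    """
    if outbound_links < 1:
        raise ValueError("a page needs at least one outbound link to pass authority")
    return page_authority / outbound_links


# One link passes the full hypothetical value; 1,000 links pass a sliver each.
single = authority_per_link(1.0, 1)
crowded = authority_per_link(1.0, 1000)
print(single, crowded)
```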

Diluted Domain Authority

For nearly the same reason that diluted page authority is possible, it's possible for an entire domain to dilute outbound PageRank. For this reason, sites that are more choosy about who they link to, relative to who links to them, are valuable, while sites functioning as complete free-for-all link farms have a value near zero.

Source(s): Matt Cutts, Larry Page

Unnatural Ratio of Anchor Text

To an extent, the anchor text used in links establishes relevance of the subject matter. As with every SEO tactic, the community abused this as far as it was able, and controls were put in place for when the limits were pushed well beyond what occurs without manipulation. That threshold may be as simple as a flat 10% of a particular anchor text. This is a function of the Penguin algorithm.

Source(s): Moz Study
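
As a rough way to audit your own backlink profile against that idea, the Python sketch below computes each anchor text's share of all links and lists the ones above a cutoff. The 10% cutoff is simply the speculative figure mentioned in the text, not a confirmed Google value, and the function names and sample data are made up for illustration.

```python
from collections import Counter

# Threshold taken from the "flat 10%" figure speculated in the text;
# it is not a confirmed Google value.
ASSUMED_THRESHOLD = 0.10


def anchor_text_shares(anchors: list[str]) -> dict[str, float]:
    """Return each anchor text's share of the whole backlink profile."""
    normalized = [a.strip().lower() for a in anchors]
    counts = Counter(normalized)
    total = len(normalized)
    return {text: count / total for text, count in counts.items()}


def over_threshold(anchors: list[str], threshold: float = ASSUMED_THRESHOLD) -> list[str]:
    """List anchor texts whose share exceeds the assumed threshold."""
    shares = anchor_text_shares(anchors)
    return sorted(text for text, share in shares.items() if share > threshold)


# Made-up backlink profile: 20 links in total.
sample = (
    ["cheap widgets"] * 6
    + ["Acme"] * 2
    + [f"https://example.com/page-{i}" for i in range(12)]
)
print(over_threshold(sample))
```

In this sample only the keyword anchor, at 6 of 20 links, exceeds the cutoff. A real audit would likely treat brand-name and URL anchors separately, since those occur in high amounts naturally.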

Unnatural Ratio of Anchor Type

The Moz study showed us the effect of a high ratio of one anchor, which we have repeatedly reproduced in our work on Penguin-penalized sites, and the same can be said for sites that use too much anchor text overall. Analyzing backlinks across popular brands shows high amounts of brand name anchor text, "click here" anchors, URL anchors, and banners. Pushing too far beyond the limits of what occurs naturally invites devaluations, and since Penguin, potential for penalties.

Source(s): Speculative

Unnatural Variety of Linking Sites

If you subscribe to the notion that Google is ultimately watching for natural trends, and you accept the studies done post-Penguin on sites that were severely penalized for carrying an anchor text greater than 10%, you may also subscribe to the notion that any type of unnatural ratio of off-page activity at scale can hurt you. Although no public case study is available at the time of writing this, we have repeatedly witnessed those practicing otherwise successful black hat SEO getting greedy, taking their scheme too far, and being penalized.

Source(s): Speculation

Webspam Footprints

A "footprint" is an off-page SEO term that describes virtually anything that Google might use to identify activity originating from a common source. This might be a forum username, a person's name, a photo, a guest author biography snippet, some element of a WordPress theme that's involved in a private blog network, or just about any subtle detail that relates the efforts of a webspam activity. Obviously, a footprint is not always bad, but if a site even slightly runs afoul of Google's Webmaster Guidelines, footprints are often a factor that brings about penalties.

Source(s): Matt Cutts via SEL

Comment Spam

If you engage in blog comment spam (that is, commenting en masse in a repetitive, unnatural format), expect to see these links devalued or penalized as a link scheme. This is especially true if your commenting is machine-driven, uses odd keyword anchor text, or leaves behind a footprint of irrelevant or repetitive content. Genuine commentary, on the other hand, is fine and actually encouraged. Mr. Cutts suggests using your real name in such circumstances for good measure.

Source(s): Matt Cutts

Forum Post Spam

Forum posts, like blog comments, are fine and actually good inbound marketing when they add to a conversation and are done for humans rather than search spiders. John Mueller confirms (amongst countless other sources that have appeared on this one over the years) that Google is systematically monitoring for link schemes in the form of bulk forum spam.

Source(s): John Mueller via SER

Advertorials (Native Advertising)

Advertorial content, also known as Native Advertising, is systematically sought out by Google's webspam team, and is considered a paid link. Links in advertorials should be disclosed and given the rel="nofollow" attribute to avert potential for penalties. Presumably, this is a case where "nofollow" is definitely respected. Undisclosed advertorials can also get an entire publication delisted from Google News.

Source(s): Matt Cutts

WordPress Sponsored Themes

On top of the low value that sitewide footer links now carry, it definitely seems that Google's webspam team is well aware of the once powerful and now mostly useless tactic of producing WordPress themes with backlinks in them. Such efforts definitely leave an obvious spammy footprint, with similarities to the GWG widget example, and it's clear that Google isn't having it.

Source(s): Matt Cutts

Article Directories

With how far Google has come with punishing domain-wide content scores with Panda and unnatural patterns of links with Penguin, it's unclear if Google would even need to go out of their way to punish these sites anymore. It seems, however, that they do still single out these article directories as an issue, as recently as a 2014 Matt Cutts video, so expect to see longer-term issues if using these methods.

Source(s): Matt Cutts

Generic Web Directories

Generic web directories were one of the earliest link schemes. Matt Cutts goes out of his way to state that they do penalize paid, generic directories as paid links if they are not exercising some editorial discretion. He cites Yahoo!'s paid directory as one that is actually alright. With any link, paid or not, there appears to be a theme: editorial discretion is good, complete free-for-all listings are bad.

Source(s): Matt Cutts via ClickZ

Google Dance

This term describes a temporary shake-up that sometimes accompanies Google's ~500 algorithm updates per year. Technically, these effects could be positive or negative, since it's just rearranging rankings and someone has to go up for another to go down. But since a Google Dance is always unexpected, we're classifying it as a negative.

Source(s): Danny Sullivan

Manual Action

In spite of every other ranking factor, Google's webspam team will still occasionally take manual action against certain sites which can take half a year to a year to recover from after you've cleaned up the problems. Often, these penalties come with a notification in Google Search Console. For this reason, it's critical to constantly look beyond the functionality of today and ask "what does Google want?" Learn Google's philosophies and market your site in harmony.

Source(s): Matt Cutts

Penalty by Redirect

John Mueller confirms via Google Hangout that organic search penalties can pass through a 301 redirected site. His confirmation makes it likely that this is a real factor. The chance of this actually occurring in the wild is probably far lower, unless you're doing something like buying used domain names in an attempt to reclaim their inbound link value or trying to circumvent a manual action.

Source(s): John Mueller via SER

Chrome Blocked Sites

Google introduced a tool in 2011 that allowed users to block sites in Google search via Chrome. They stated "while we're not currently using the domains people block as a signal in ranking, we'll look at the data and see whether it would be useful". Therefore, there's no guarantee that this is an automated factor in rankings, but we're also not about to believe that nobody on the webspam team is looking at this data.

Source(s): Amay Champaneria

Negative Sentiment

In 2010, Google told us that the sentiment expressed towards a brand, such as in reviews or the text surrounding links, is a ranking factor. Reviews were known to be a huge part of local or "Google Maps SEO" rankings before that. The implications of this are a little complex, but Moz's Carson Ward did a great piece on it.

Source(s): Amit Singhal, Patent US 7987188 B2

Crawl Rate Modification

Google Search Console allows you to modify the rate at which your site is crawled by Google. It's not really possible to speed up Googlebot, but it's certainly possible to slow it down to zero. This can cause problems with indexing, which means problems for ranking, especially in regards to factors surrounding fresh content and editing.

Source(s): Google

International Targeting Tool

Google Search Console provides a tool for international targeting when it may not be done correctly otherwise. Theoretically, this tool could also cause harm if it is used to restrict your site's appearance in search results to a particular region that does not encompass your entire desired market region.

Source(s): Google

No Editorial Context

Matt Cutts tells us that all links should be published with editorial discretion. But not all links need to be placed in an editorial context - that is, within the middle of a story or article. It takes very little experimentation to see that higher quality links outside of editorial context, such as a local Chamber of Commerce membership page, help quite a lot with authority. It's also plain to see that this would be an unnatural pattern.

Source(s): Julie Joyce

Microsites

It's been suggested that there's some penalty reserved for microsites: websites with an extremely narrow scope and not a lot of pages. Matt Cutts gives us Google's stance: Microsites aren't hunted and penalized by Google, they're just usually not a very effective tactic as a part of a long-term strategy since sitewide ranking factors will remain weak. They're also not very effective at exploiting Exact Match Domain bonuses using keyword-focused domains anymore.

Source(s): Matt Cutts

Click Manipulation

If you subscribe to the notion that Click Through Rate (CTR) is a positive factor, it's reasonable to suggest that webspam controls exist here too. Rand Fishkin's Twitter CTR experiments present evidence of a mass-clicked page rising from #6 to #1, dropping to #12, and then being restored to its position, all within the course of a couple of days.

Source(s): Rand Fishkin

Brand Search Manipulation

Another theory is that if brand searches are a ranking factor, as patents suggest, webspam controls must also exist here to prevent abuse. Otherwise, this factor would be far too easy to manipulate.

Source(s): Patent US 9031929 B1

Illegal Activity Report

Google has a form that asks users to report any illegal activity occurring within their content. This page implies that any such content will be removed from any Google products, including Google Search. We have no reason to doubt them on this one, and don't expect that anyone's going to do an experiment on this factor anytime soon either.

Source(s): Google

DMCA Report

In addition to automated controls for detecting stolen content, un-cited sources, and potential copyright violations, Google also encourages users to send DMCA requests directly to Google. This almost certainly invokes the DMCA process within the United States, during which Google has no choice but to remove any offending content accessible on their domains.

Source(s): Webmaster Tools, DMCA Process

Low Dwell Time (Short Click)

A Google patent suggests seeking "a click of at least a certain duration, or a click of at least a certain duration relative to a resource length" on branded queries. Steven Levy's "In The Plex" first-hand account of Google suggests that this is basically Google's best measure of search result quality. Finally, Bing and Yahoo! both have suggested using dwell time, in some scope, as a ranking factor.

Source(s): Patent US 9031929 B1, Steven Levy (In The Plex), Bill Slawski

High Task Completion Time

We have quite a bit of evidence that Click Through Rate and Dwell Time may be ranking factors, though not directly confirmed. We also know of a research paper co-authored by Google employee David Mease, which describes analyzing the overall time it takes a searcher to find a result that they're happy with and responding with an "alternative experiential design". Is it possible that automated A/B testing will "shake up" the weighting of factors based on how happy users appear with their results?

Source(s): David Mease

GSC URL Parameters

The Google Search Console URL Parameters feature is a potential means to exclude duplicate, or duplicate-ish, content from Google in situations where variables might exist in the page's URL. Used correctly, that can be beneficial. By design, however, it removes content from Google's index, which is quite literally a negative. Used incorrectly, this can be especially damaging.

Source(s): Maile Ohye


Ready to rank?

We believe that the "why" of SEO is critical. Only with this knowledge can you make great decisions for your brand.

This fact-check was the beginning. After answering the "why", we needed the "what". On an Internet with near infinite SEO tools, Northcutt created a system. That's when the 1,200-step audit emerged. Through 24 step-by-step checklists, we address everything above, and more, tactically.

