Why Penguin Comes in Different Degrees (And A Myth Persists)


Matt Cutts recently tweeted something that gives us a peek under the hood of Google's Penguin algorithm, an algorithm designed to discourage spammy link building practices. This is what the tweet said:

The fact that you can have a "mild case of Penguin" indicates that Penguin is not an on/off switch. This is further emphasized by a later tweet:

I don't find this surprising at all. In fact, it actually strengthens some of my assumptions about how Penguin works, and it brings us back, once again, to what may be the biggest misconception about Penguin.

Let's talk about what this means for SEO, and why you should care.

What Penguin Recovery Looks Like

As I've discussed before, most Penguin recoveries look pretty tame. They rarely look like the sudden, stark recoveries you might see from somebody who was suffering from a site-wide manual action. In general, they look something like this:

[Image: penguin recovery 01]

These "recoveries" are very slow and gradual. More importantly, when these sites make it back to their previous level of traffic, it's almost never on the date of an actual Penguin update. In other words, these sites aren't recovering because a Penguin "penalty" has been "lifted." We all know that Penguin data is processed separately from the main algorithm, and only released on specific dates.

As I've argued before, what I believe we're looking at here is simply typical SEO. These sites are building links and/or adding content, and gradually picking up traffic as they do so.

SEOs continue to insist that these recoveries are the result of link removal. They believe that "negative" links are counting against them. I don't have any insider knowledge. It's always possible that this assumption is correct. But I've never seen any compelling evidence to believe that this is true.

As I mentioned in a previous post, Spencer Haws' site makes for a very good example:

[Image: penguin recovery 02]

When his site was hit by Penguin, it dropped down to about 400 visitors per day. And just before Spencer's site started getting hit with a negative SEO attack, he was getting about 400 visitors per day. The way I see it, when Penguin hit his site, it simply stripped away the value of those spammy links, returning him to his previous traffic levels.

We'll see what happens on the next Penguin update. If the links really are counting against him, then all of his link removal efforts should cause him to see a dramatic recovery on the date of the next update.

Until then, I'd like to draw your attention to one of the very rare examples of a site experiencing a dramatic recovery on the date of a Penguin update. I'll explain why it happens, and why I don't think link removal is responsible.

Every Once in a While, There IS a Dramatic Recovery

This example comes from a site called SEO-Website-Designer.com. And the recovery looks like this:

[Image: penguin recovery 03]

Here are the changes site owner Tony McCreath made to try to recover:

  • Stopped doing 301 redirects from an expired domain
  • Dropped an RSS feed in the sidebar
  • Cleaned up the Footer
  • Removed the use of ctags (could look like hidden keywords)
  • Changed forum signatures to be brand focused
  • Dropped signatures/links from dormant profiles
  • Stripped out backlinks from profile scraping websites
  • Disconnected my PageRank tool and its low quality backlinks
  • Shut down a single page website providing backlinks
  • Disavowed one website

But even after all of those changes, he only saw a slight boost in traffic much later with the Penguin 2.0 release. After that, he said he did a few more disavows and changed his hosting. I have a difficult time believing that either of those made the difference.

It's possible, but I have another theory.

According to the Wayback Machine, this was sitting in his sidebar just before Penguin 1.0 came out:

[Image: penguin recovery 04]

I suppose that this is the RSS feed he was talking about. But all of these "activities" are links to external sites, and most of them aren't necessarily high quality. I'm not passing judgment on these sites. Most of them seem okay, as in "not spam." But it's easy to see how Penguin might interpret these as potential paid links.

Now, Tony did remove this almost immediately after the penalty. So why didn't he recover after the second Penguin update?

Well, this was sitting in his sidebar as late as August 5, 2013, which is after the Penguin 2.0 update:

[Image: penguin recovery 05]

At some point between then and November 16, the box was removed. I have a feeling that it was removed before Penguin 2.1, and I have a feeling that this is why he recovered when Penguin 2.1 was released.

Here's why. This widget links to experts-exchange.com, a site that I will not be linking to, for reasons that should be immediately clear.

It turns out that Google does not like experts-exchange.com. Back in November of 2011, Search Engine Roundtable wrote a blog post about just how much Google doesn't like Experts Exchange. In fact, it is one of the sites most often blocked by Google's users.

As I've argued before, I don't believe that Penguin penalizes sites with spammy inbound link profiles. Instead, I believe that it penalizes sites with spammy outbound link profiles. This makes more sense from Google's perspective, for several reasons:

  • Google crawls the web forward, not backward. It doesn't make sense for Google to penalize one site and not another if they both receive links from the same spammy site. Nor does it make sense for Google to penalize every site that gets linked to by a spammy site. What makes most sense is to penalize the spammy site with the outbound links, or at least ignore its outbound links.
  • Matt Cutts said that the original Penguin algorithm only looked at the homepage of a site. But several case studies demonstrate that some sites have been hurt by Penguin even though the spammy links pointed everywhere but the homepage. Unless Matt Cutts was lying, this means he was talking about the homepage of sites with spammy outbound links, not the homepage of sites with spammy inbound links.
  • Inserting negative links into the algorithm is risky business because it allows for negative SEO. The only way to perform negative SEO on outbound links would be to hack the site, and Google is already willing to block sites that have been hacked.
  • The risk of a false positive is too high.
  • Identifying one site with several spammy outbound links is a more effective way of impacting more spammers more quickly.

I'm not sure that this one widget link to Experts Exchange would have normally resulted in a Penguin penalty. However, since he already had a sidebar filled with outbound links that could be interpreted as spammy, he was being held to higher standards. Removing all but one spammy link wasn't enough.

And This is Why Penguin Comes in Different Degrees

As far as I can tell, the only sites that are directly penalized by Penguin are the sites that host spammy outbound links.

This has a secondary effect on sites with spammy inbound links. These sites are not directly penalized. Instead, the sites that link to them have been penalized, so those links become worthless. This causes their traffic levels to drop, as though all of those links had been removed or nofollowed.

If a site that links to you has been penalized, you are suffering from a "mild" case of Penguin. You lose link value, but you aren't directly penalized.
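To make the distinction concrete, here is a minimal sketch of the theory as a toy model. It is only an illustration of the idea described above, not Google's actual algorithm: the function names, the example domains, and the link values are all hypothetical.

```python
# Toy model of the outbound-link theory (an illustration only, not Google's
# algorithm). Every name, domain, and number here is a hypothetical assumption.

def effective_link_value(inbound_links, penalized_sites):
    """Sum inbound link value, treating links from penalized sites as worthless."""
    return sum(
        value
        for source, value in inbound_links.items()
        if source not in penalized_sites
    )


def penguin_degree(site, inbound_links, penalized_sites):
    """Classify a site as 'direct', 'mild', or 'none' under this theory."""
    if site in penalized_sites:
        # The site itself hosts spammy outbound links.
        return "direct"
    if any(source in penalized_sites for source in inbound_links):
        # Not penalized, but some of its inbound links have lost their value.
        return "mild"
    return "none"


# Hypothetical inbound link profile: linking site -> value it passes.
inbound = {
    "good-blog.example": 5.0,
    "link-farm.example": 3.0,
    "spam-directory.example": 2.0,
}
penalized = {"link-farm.example", "spam-directory.example"}

before = effective_link_value(inbound, set())      # no site penalized yet
after = effective_link_value(inbound, penalized)   # after the update
degree = penguin_degree("my-site.example", inbound, penalized)
print(before, after, degree)
```

In this sketch the site falls from 10.0 to 5.0 in link value without being penalized itself, which is the "mild" case: the drop looks the same as if those links had simply been removed.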

So Why Does Matt Cutts Say to Remove the Links?

I'm going to avoid getting cynical and assuming that Cutts just wants you to do his job for him. I think there's still a legitimate reason to remove links.

Let me explain.

Let's use Tony's site as an example. If I had to guess, I would say that his site was penalized because he linked to Experts Exchange, and had a sidebar filled with outbound links. The fact that he had a sidebar filled with outbound links probably wouldn't be enough to get hit by Penguin. The fact that he linked to Experts Exchange probably wouldn't typically be enough either.

But, since he did both, Google assumed that his sidebar was filled with spammy outbound links.

Now, speaking hypothetically, if another site linked to one of those sites found in his sidebar, Penguin might view that site with suspicion as well.

Let's put it this way. If a site has a sidebar filled with outbound links, but none of those sites has been labeled as spammy by Penguin, this site probably isn't going to get hit. But let's say that, elsewhere on the web, those links tend to show up right next to known spammy links. In that case, Penguin is going to see those links as suspicious. If you have a high enough percentage of those links on your site, Penguin is going to penalize you.
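The guilt-by-association idea in the paragraph above can be sketched as a two-step heuristic. This is speculation turned into code, not a description of how Penguin really works: the function names, both thresholds, and the example domains are assumptions chosen purely for illustration.

```python
# Hypothetical sketch of the co-occurrence idea (not Penguin's real logic).
# Thresholds and domains are illustrative assumptions.

def suspicious_targets(pages, known_spam, min_cooccurrence=0.5):
    """Flag link targets that tend to appear on pages that also link to known spam.

    pages: dict mapping a linking site to the set of sites it links out to.
    """
    appearances = {}
    alongside_spam = {}
    for targets in pages.values():
        links_to_spam = bool(targets & known_spam)
        for target in targets - known_spam:
            appearances[target] = appearances.get(target, 0) + 1
            if links_to_spam:
                alongside_spam[target] = alongside_spam.get(target, 0) + 1
    return {
        target
        for target, count in appearances.items()
        if alongside_spam.get(target, 0) / count >= min_cooccurrence
    }


def is_penalized(targets, known_spam, suspicious, max_share=0.6):
    """Penalize a site when enough of its outbound links look spammy or suspicious."""
    if not targets:
        return False
    flagged = targets & (known_spam | suspicious)
    return len(flagged) / len(targets) >= max_share


pages = {
    "a.example": {"spam.example", "x.example", "y.example"},
    "b.example": {"spam.example", "x.example"},
    "c.example": {"x.example", "z.example"},
    "d.example": {"z.example", "w.example"},
}
known_spam = {"spam.example"}

suspicious = suspicious_targets(pages, known_spam)
penalized = {
    site for site, targets in pages.items()
    if is_penalized(targets, known_spam, suspicious)
}
print(sorted(suspicious), sorted(penalized))
```

Here x.example and y.example become suspicious only because of where they appear. The sites a.example and b.example are penalized, while c.example links to one suspicious site but stays under the threshold, which mirrors the claim that a single questionable link is usually not enough on its own.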

Now, if you have a spammy inbound link profile, and you start to clean it up, some of the other sites that link to you are going to be viewed with less suspicion. In some cases, that may be enough to tip the balance. A few of the sites that still link to you will no longer be seen as spammy, and their outbound link value will be restored, which causes your rankings to go up.

In short, if you have a spammy inbound link profile, the innocent sites that link to you will be viewed with suspicion, and some of them may even get penalized or lose their outbound link value. When you clean up your link profile, any remaining sites that link to you will be viewed with less suspicion. They may have their penalties lifted, or their outbound link value restored, as a result. This, of course, will cause your rankings to go up.

Ultimately, this means that link removal can help, but not in the way that we expect it to. More importantly, it means that we should be placing far more emphasis on earning authoritative, trustworthy links than on removing links.