The evidence continues to mount for an alternative theory that Penguin doesn't work how you think it does.
The standard theory goes something like this: if your inbound link profile looks spammy (however that may be defined), you get penalized.
If your outbound link profile looks spammy, it either gets ignored, or in extreme cases, Penguin penalizes you.
Why do I suspect this is the case, and where's my evidence? Let's start with the evidence.
The Penguin 3.0 Recovery Case Studies
The strongest case for what makes Penguin tick is not what the site looks like, or what its link profile looks like, when it gets penalized. The strongest case comes from looking at the changes that led to a recovery.
Most of the recoveries people report are gradual, and they don't happen on the date of a Penguin update. You can't claim that a Penguin penalty has been "lifted" if it's not on the date of a Penguin release. That kind of "recovery" is what happens when you just start building legitimate links, generating legitimately positive user behavior, and reaching a wider variety of queries by improving the amount of content on your site. Your link removal requests have nothing to do with it. It's just SEO.
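To make the "stark versus gradual" distinction concrete, here is a minimal Python sketch of how I'd test a traffic chart against it. Everything in it is an assumption for illustration: the traffic numbers are hypothetical, the 25% gain threshold and 14-day window are arbitrary, and the two release dates are the commonly reported rollout dates for Penguin 2.1 and 3.0.

```python
from datetime import date, timedelta

# Commonly reported rollout dates for two Penguin releases; treat them as
# assumptions and substitute whatever dates you trust.
PENGUIN_RELEASES = [date(2013, 10, 4), date(2014, 10, 17)]

def biggest_jump(weekly_visits):
    """Return (week_ending, relative_gain) for the largest week-over-week rise."""
    weeks = sorted(weekly_visits)
    best = None
    for prev, cur in zip(weeks, weeks[1:]):
        if weekly_visits[prev] == 0:
            continue  # relative gain is undefined from a zero baseline
        gain = (weekly_visits[cur] - weekly_visits[prev]) / weekly_visits[prev]
        if best is None or gain > best[1]:
            best = (cur, gain)
    return best

def is_stark_recovery(weekly_visits, releases, min_gain=0.25, window_days=14):
    """True if the largest jump is big and lands shortly after a release."""
    jump = biggest_jump(weekly_visits)
    if jump is None:
        return False
    week_ending, gain = jump
    near_release = any(
        timedelta(0) <= week_ending - r <= timedelta(days=window_days)
        for r in releases
    )
    return gain >= min_gain and near_release

# Hypothetical weekly visit counts, keyed by the date each week ended.
stark = {date(2014, 10, 5): 1000, date(2014, 10, 12): 1020,
         date(2014, 10, 19): 1400, date(2014, 10, 26): 2100,
         date(2014, 11, 2): 2150}
gradual = {date(2014, 3, 2): 1000, date(2014, 3, 9): 1050,
           date(2014, 3, 16): 1100, date(2014, 3, 23): 1160}

print(is_stark_recovery(stark, PENGUIN_RELEASES))
print(is_stark_recovery(gradual, PENGUIN_RELEASES))
```

A check like this only says whether the timing is consistent with a Penguin release. It can't say what caused the jump.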
I believe that the majority of sites that have been negatively impacted by Penguin are not being directly penalized. Instead, they have had their link value removed. I've previously pointed to Spencer Haws's site. The site had roughly 400 visitors per month before it started getting hit with negative SEO. The negative SEO links actually improved traffic for a while. Then his site was "hit" by Penguin. But the end result was that he was back to about 400 visitors per month:
This is, in my opinion, simply an example of link value being lost. There's no evidence that his site was actually directly penalized. Spencer has since redirected the site, so there is no way to know for sure what would have happened to his site with the Penguin 3.0 update. I would suspect, however, that all of his link removal would have been pointless, and that the site would have continued to receive the same level of traffic.
Stark recoveries on the date of a Penguin release do happen occasionally, however. As I shared in a previous blog post, in at least one case, this seemed to be the result of removing a distrusted link from the sidebar. That's right, removing an outbound link:
In this example, the guy did tons of link removal and disavowing after getting hit by Penguin 1.0. No luck after doing that. He sat through 3 more Penguin updates and saw no improvement. Then, suddenly, he recovered with Penguin 2.1. After reviewing his site in the Wayback Machine, however, I noticed that he had removed a link from his sidebar to experts-exchange.com, a site that has an established negative reputation with Google. I believe that removing this link is what caused his site to recover with the next Penguin update.
So the question is, are we seeing similar stories with Penguin 3.0?
Investigating the Cognitive SEO Recovery Stories
As usual, there aren't many public stories about Penguin recovery that actually share which site is involved, but a few are available. Cognitive SEO published a blog post drawing attention to a few sites that were hit by Penguin 3.0, as well as a few recovery stories. The first recovery story is for a site that currently redirects to a Facebook page, and that was actually unavailable when it was supposedly hit by the previous Penguin update, so it doesn't make for a very good case study.
Next up, however, we have xtremediesel.com.
Looking at the site in the Wayback Machine, here is what it looked like on Sep 25, 2013, not long before the previous Penguin update. Take a look at the footer:
Now, look closer:
So, what we have here are hidden links. I don't believe these links were hidden intentionally. Most of these are navigational. A couple of them technically link to external websites, however. One to Facebook, and more importantly, one to their blog: blog.xdp.com. It's clearly owned by the same company, but it is on an entirely different domain. Combine that with hidden links and you've got yourself a false positive: Penguin thinks you're selling spammy links and hiding them in your footer by coloring them the same as the background.
This problem was corrected on or before December 6, 2013. It's my belief that correcting this issue is the reason that they experienced a stark recovery with the release of Penguin 3.0.
This is backed up by the fact that Cognitive SEO doesn't see very many suspicious inbound links to the site at all:
Only 5% of their inbound links are marked "unnatural" and only another 5% are marked "suspect." I find it difficult to believe that a site is being directly penalized by Penguin because a maximum of 10% of its inbound links are "unnatural."
I can't find any evidence that the next site in Cognitive SEO's blog post was placing any outbound links that could be interpreted as spammy, but it also didn't experience what I would call a stark recovery:
A modest recovery like this is characteristic of other sites getting hit by Penguin 3.0, giving itshot.com a slight boost as a result. It does not look like a Penguin penalty has been lifted. The same goes for their next example, roundgames.com.
But then we have shopworldkitchen.com:
Well, here's what that site's header looked like on September 25, 2013:
Can you guess what this header looks like if I select it with the cursor?
Seeing a pattern? I sure am.
In this case, none of these links appear to be external. It may be that Penguin doesn't care. Google may believe that any hidden link, intentional or not, external or internal, is a sign of a site that doesn't deserve to rank.
This problem was corrected on or before November 15, 2013. Again, I believe it was fixing this problem that caused the site to experience a full recovery with the release of Penguin 3.0.
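The pattern on both sites, link text colored the same as its background, is something you can check for mechanically. Here is a rough Python sketch of the idea using only the standard library. It is a simplification under stated assumptions: it reads inline `style` attributes only (a real page would need its stylesheets resolved), it compares color values as plain strings, and the sample markup and URLs are hypothetical. It says nothing about how Penguin itself detects hidden links.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

def parse_style(style):
    """Turn an inline style string into a {property: value} dict."""
    props = {}
    for decl in (style or "").split(";"):
        if ":" in decl:
            name, value = decl.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props

class HiddenLinkFinder(HTMLParser):
    def __init__(self, page_background="#ffffff", page_color="#000000"):
        super().__init__()
        # Each entry is (tag, text color, background color) currently in effect.
        self.stack = [("root", page_color, page_background)]
        self.hidden = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        style = parse_style(attrs.get("style"))
        _, color, background = self.stack[-1]
        color = style.get("color", color)
        background = style.get("background-color", background)
        if tag == "a" and attrs.get("href") and color == background:
            self.hidden.append(attrs["href"])
        if tag not in VOID_TAGS:
            self.stack.append((tag, color, background))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

def find_hidden_links(html, **kwargs):
    finder = HiddenLinkFinder(**kwargs)
    finder.feed(html)
    return finder.hidden

# Hypothetical footer: the second link matches its container's background.
sample = """
<div style="background-color: #333333">
  <a href="/contact" style="color: #ffffff">Contact</a>
  <a href="http://blog.example.com/" style="color: #333333">Blog</a>
</div>
"""
print(find_hidden_links(sample))
```

Because the comparison is a plain string match, `#fff` and `#ffffff` would not be treated as the same color. Normalizing color values is left out to keep the sketch short.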
Continuing down Cognitive SEO's blog post, we see another moderate "recovery" I'm not interested in investigating, and then a final example, watchcartoononline.com:
Watchcartoononline.com seems to have had a notorious pop-up ad problem according to reviews at mywot.com. A visit to the site in the Wayback Machine before the previous Penguin update suggests that clicking on any link anywhere on the site produced a pop-up ad. While the site still shows pop-ups when clicking on the videos, and displays ads in various places throughout the site, it no longer pops up an ad every time you click a link anywhere on the site. There is a good possibility that this was the culprit, since deceptive or cloaked links are against Google's webmaster guidelines.
I have massive respect for Marc Enzor, founder of Geeks2You, for publicly sharing that his site had been hit by Penguin, and for announcing its recovery with the Penguin 3.0 update:
The site was hit all the way back in May of 2012 by Penguin 1.1. Marc doesn't share any dates on when he started or finished his link removal and disavow process, but I doubt he waited until after Penguin 2.1, in October of 2013, to begin this process. Nevertheless, he didn't see a recovery until Penguin 3.0. Did the work just get missed by the other 3 Penguin updates? Did he not do enough of the removal beforehand?
Maybe, but I have another theory.
Before the Penguin 1.1 update, Geeks2You had a "links" page which listed 6 affiliate links. All of those links were "dofollow." Those links were still "dofollow" after the Penguin 2.1 update. But by October 13, 2014, not long before the release of Penguin 3.0, one of those links had been removed, and the rest were properly labeled as advertisements with the rel="nofollow" attribute.
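If you want to audit your own pages for the same issue, a short script can list every outbound link that lacks rel="nofollow". The Python sketch below uses only the standard library. The host names and markup are hypothetical, and it compares hosts exactly, so it treats a subdomain or a sister domain as external, much as blog.xdp.com was external to xtremediesel.com.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class FollowedOutboundLinks(HTMLParser):
    """Collect links to other hosts that do not carry rel="nofollow"."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host.lower()
        self.followed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href") or ""
        host = (urlparse(href).hostname or "").lower()
        if not host or host == self.own_host:
            return  # relative link, non-web link, or link to the same host
        rel_tokens = (attrs.get("rel") or "").lower().split()
        if "nofollow" not in rel_tokens:
            self.followed.append(href)

def followed_outbound_links(html, own_host):
    parser = FollowedOutboundLinks(own_host)
    parser.feed(html)
    return parser.followed

# Hypothetical "links" page with one followed and one nofollowed partner link.
links_page = """
<a href="/about">About</a>
<a href="http://partner.example.net/deal">Deal</a>
<a href="http://ads.example.org/offer" rel="nofollow">Offer</a>
"""
print(followed_outbound_links(links_page, "www.example.com"))
```

The output is only a list of candidates to review. Whether a followed link is an ad, an affiliate link, or a legitimate editorial citation is still a judgment call.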
Still Think Penguin's Algorithm is Based on "Negative PageRank?"
I'll admit it. When I first started discussing this issue, my language was a bit more crass. I have more evidence today to back up my theory than ever before, but, ironically, I'm more willing to accept that I could be wrong. Experience has tempered my ambition.
But I have to say, every site I've looked at that's gone through a stark and sudden Penguin recovery so far has a similar story. At the time they were first hit by Penguin, they had on-site issues such as hidden links, followed ad links, cloaked links, or links to distrusted websites. Those that experienced multiple Penguin updates before seeing a recovery did not fix their on-site issues until shortly before the final update, and they remained penalized until that final update, despite massive link profile cleanup efforts. Some of these sites even seemed to have almost squeaky clean link profiles at the time they were penalized.
Sites that were "penalized" without on-site issues like these likely weren't penalized directly. I believe they merely lost their inbound link value. It was cut out from under their feet because the sites that were selling links or allowing easy spam were either penalized, or had their outbound link value removed. Sites that recover from this kind of "penalty" very rarely see their "recovery" on the date of a Penguin release. The recovery is usually very gradual and appears to be the result of new links and content. If "negative PageRank" or direct penalties were hindering these sites, we would expect to see more dramatic recoveries with the release of a new Penguin update. At best, we see very modest recoveries for these types of sites, which are more easily explained by other sites getting penalized or negatively impacted by Penguin.
In previous posts, I've gone on at length about why it doesn't make sense for Google to penalize sites for their inbound link profiles:
- Google crawls the web forward, not backward.
- It does not make sense to punish one site and not another if they both receive a spammy link from the same page.
- Policing efforts always work best when they punish the seller, not the buyer. History has borne this out.
- All of Google's most effective and most publicized manual efforts against spam have targeted sites that were doing the "selling": MyBlogGuest, BuildMyRank, content farms, and other networks.
- When Matt Cutts warns against link transactions, he's almost always warning webmasters not to sell links.
- Matt Cutts has specifically said that, in most cases, if a site gets penalized and it links to you, it doesn't suppress your ability to rank; you simply lose some or all of the value of that link. While he clearly left room in his language for exceptions, to me it strongly suggests that the algorithm does not incorporate anything like "negative PageRank." I am quite sure the room he left for exceptions was for manual penalties.
- Penguin can affect more sites more quickly by penalizing a single page with a large number of outbound spammy links than by targeting each of them individually.
- The risk of false positives is much lower when analyzing pages for spammy outbound links than spammy inbound links. Spammy outbound links can't be planted on a site through negative SEO except in cases where the site is hacked or left vulnerable to easy link grabbing. Google has already made it clear that they are willing to penalize sites that have been hacked or that have left their sites open to spammy comments, etc.
- Google has no incentive to admit that Penguin doesn't penalize inbound link profiles or that it penalizes spammy outbound links.
- It is easy for a webmaster to control things like anchor text and avoid using anchor text that appears spammy. It is not easy and often impossible for a webmaster to keep link sellers from revealing themselves.
- Preventing PageRank from being passed forward harms link spammers without harming sites with relatively clean link profiles.
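The last point, and the difference between losing link value and carrying "negative PageRank," can be illustrated with a toy model. The Python sketch below runs textbook PageRank (power iteration with a 0.85 damping factor) on a hypothetical four-page web, first with a link seller passing value to a shop, then with the seller's outbound links ignored. It is an illustration of the concept only. The page names and graph are made up, and nothing here is meant to represent how Google or Penguin actually computes anything.

```python
def pagerank(links, damping=0.85, iterations=100):
    """links: {page: [pages it links to]}. Returns {page: score}."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outbound links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n
                    for p in pages}
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank

# Hypothetical web: "shop" and "rival" are identical except that "shop"
# also receives a link from "seller".
with_link = {
    "hub": ["seller", "shop", "rival"],
    "seller": ["shop"],
    "shop": ["hub"],
    "rival": ["hub"],
}

# Same web, but the seller's outbound links no longer pass any value.
devalued = dict(with_link, seller=[])

before = pagerank(with_link)
after = pagerank(devalued)
for page in ("shop", "rival"):
    print(page, round(before[page], 3), round(after[page], 3))
```

In this toy web, the shop outranks its otherwise identical rival while the seller's link counts. Once the seller's outbound links are ignored, the shop falls back level with the rival, not below it, and the rival's score rises a little. That is the "lost link value" outcome: it looks like a penalty on a traffic chart, but nothing negative has been applied to the shop.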
I'm open to alternative theories. The evidence I've accumulated so far could be coincidence, or subconsciously cherry-picked. If one of our clients were hit by Penguin, we would undoubtedly do some link removal and disavowal, even if just to guard against future manual penalties. But, as things stand, I strongly believe that if there is such a thing as "negative PageRank" anywhere in Google's algorithm, it can't be an especially strong signal, or we would be seeing more evidence for its existence. Likewise, I believe that if your inbound link profile was the problem, you should be focusing the lion's share of your resources on building reputable links and a sustainable marketing strategy, with only a small percentage of your efforts going towards link removal. I've never seen a case study encouraging enough to suggest anything different.
And finally, if you were hit by Penguin 3.0, I would say the very first place to look would be for anything on-site that could be interpreted as spammy. Penguin's promise always was to eliminate web spam. It was the SEO community who got link profile myopia.
Image credit: Chris Pearson