Why Rand Fishkin Is Wrong About Correlation

By Cara Bowles, under SEO.

...Okay, here we go...

Look, I don't like to pick internet fights for attention. I respect Rand Fishkin's opinions on a lot of things in the industry. I use Moz's little title tag tool practically every day. I've published posts at Moz. But there are times when an idea gets perpetuated and it just has to be put to a stop. What follows isn't about Rand, or Moz. Heck, in my opinion, it's also much, much bigger than a misconception in the SEO community.

So, Rand just published this blog post today, and in it he says this:

Today I'm going to make a crazy claim--that in modern SEO, there are times, situations, and types of analyses where correlation is actually MORE interesting and useful than causality.

Alright. A lot of interesting things come from correlation studies. They are a good jumping-off point. They can tell us where causality might exist. But he is dead wrong about correlation ever being more interesting or useful than causality.

To give Rand his fair treatment, let me summarize the point of his blog post: SEOs should sometimes spend more time thinking about what Google is trying to rank, rather than specifically how. I want to be 100% clear before I move on: I don't disagree with that point. What I disagree with is what appears to be a fundamental misunderstanding of the differences between causation and correlation. I'm not just throwing an academic conniption fit here, either, and here's why.

The Misunderstanding That Follows Can Be Dangerous

The central problem is this: correlation studies are a poor way to guess what kinds of things Google wants to rank.

Here's something Rand said in the post that betrays his fundamental misunderstanding:

If many high-ranking sites in your field are offering mobile apps for Android and iOS, you may be tempted to think there's no point to considering an app-strategy just for SEO because, obviously, having an app doesn't make Google rank your site any higher. But what if those mobile apps are leading to more press coverage for those competitors, and more links to their site, and more direct visits to their webpages from those apps, and more search queries that include their brand names, and a hundred other things that Google maybe IS counting directly in their algorithm? [Emphasis mine]

Here's what Rand seems to miss. If putting an app up causes you to get more press, which causes you to get more links, and causes more people to visit your site, and causes people to search for your brand, and those actions in turn cause Google's algorithm to improve your rankings, this is a cause-and-effect relationship.

What Rand describes here is not mere correlation; it's indirect causation.

And he made this mistake more than once:

...what if those TV ads drive searches and clicks, which could lead directly to rankings?

Again, this is not just correlation; it's indirect causation.

So, what's the big deal? Isn't this all just a semantic debate? Don't I agree with him that we shouldn't obsess so much over figuring out how Google's algorithm works, since that's actually impossible? Yes? So what's the problem?

Here's The Problem (2 Big Problems, Actually)

1. When we confuse correlations with indirect causes, we stop caring whether our actions matter

Let me just come out and say this: the SEO industry will never unravel the increasingly complex mysteries of Google's algorithm.

That doesn't mean we can't identify things we can do that tend to improve a client's visibility in the search results.

The key is to focus on indirect causes.

In psychology, nobody pretends that they are ever going to discover a perfect therapy that will cure 100% of patients of depression. However, they recognize that there are things they can do that will be helpful for a certain percentage of their patients. They can conduct scientifically designed experiments, with control groups and everything, that demonstrate whether one particular therapy works better than another on average.

This is the power of indirect cause. Psychologists need not understand everything about how an ever-changing human brain works to discover that certain therapies work better than others.

SEO is similar. We may not have clean control groups, but we can measure impact. We can identify strategies that seem to work better than others. We can test.

Correlation studies are an easy out. They let us stop caring whether what we do actually works, and instead just copy what successful people have done. We don't care why those people were successful, and we don't ask ourselves whether it makes sense to copy them, and we don't ask whether the things they did actually contributed to their success in the first place.

At the end of the day, those correlation studies are what really convinced people that they needed to flood their site with spammy links. When some SEOs saw correlation studies showing how highly links were correlated with rankings, they spammed their sites. In some cases, they actually thought that's what Google wanted, because the correlation was there.

2. When we confuse correlations with indirect causes, we screw up cause and effect

This is a big one too. Let me give you some examples.

  • The correlation studies almost certainly overvalue the importance of links. Why? Because a site that ranks well in search results is also going to get linked to more often than a site which doesn't rank well. I'm not saying links don't help your rankings: of course they do. But rankings also cause people to link to you, so the strength of links as a factor is overrated when you just look at correlation.

  • We already know that Google doesn't use social media factors in the algorithm, but the correlation with rankings is high. Why? At least in part, it's because high rankings will cause more people to see your content, and some of them will share it on social networks. To some extent, it may also be because a social media presence helps create signals and indirectly causes your rankings to improve. It's dangerous, though, to rush to the conclusion that because this correlation exists, Google probably wants to rank the kind of site that will do well on social media. In fact, I think that's dead wrong, because Google loves Wikipedia and academic sites, and they're definitely not social.
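The feedback loop in the first bullet (links help rankings, but rankings also attract links) can be sketched as a toy simulation. Everything here is an assumption made up for illustration: the function names, the coefficients (a modest 0.3 effect of links on ranking, a larger 0.8 effect of ranking on newly earned links) and the noise levels. It is not a model of Google's algorithm, only a demonstration of how the loop inflates the observed correlation.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_link_feedback(n_sites=10_000, seed=0):
    """Toy model: links help rankings a little, rankings attract links a lot.

    All coefficients below are invented for illustration, not measured.
    Returns (correlation of built links with ranking,
             correlation of total links with ranking).
    """
    rng = random.Random(seed)
    built_links, total_links, rankings = [], [], []
    for _ in range(n_sites):
        built = rng.gauss(0, 1)                  # links the site went out and got
        rank = 0.3 * built + rng.gauss(0, 1)     # assumed modest causal effect of links
        earned = 0.8 * rank + rng.gauss(0, 0.5)  # assumed: visibility attracts more links
        built_links.append(built)
        total_links.append(built + earned)       # what a link-count crawl would see
        rankings.append(rank)
    return pearson(built_links, rankings), pearson(total_links, rankings)


corr_built, corr_total = simulate_link_feedback()
print(f"built links vs. ranking: {corr_built:.2f}")
print(f"total links vs. ranking: {corr_total:.2f}")
```

Under these made-up numbers, the correlation for total links should land well above the correlation for the links that actually did the causal work. A correlation study only sees the total, so it credits links with more influence than they had.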

We don't just get cause and effect backwards, either. I can guarantee that a lot of the factors we think are important only correlate with rankings because marketers are using them.

If marketers are using strategy X, and non-marketers are not using strategy X, and marketers tend to rank better than non-marketers, we will see strategy X correlate with rankings. In that case, all it means is that marketers use strategy X. It's not necessarily an indirect cause of rankings or an indirect effect of them.
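The strategy X scenario can be sketched the same way. In this toy simulation, strategy X has no effect on rankings at all; marketers simply use it more often and rank better for unrelated reasons. The adoption rates (80% vs. 10%), the ranking advantage and the function names are all hypothetical, chosen only to make the effect visible.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_marketer_confound(n_sites=10_000, seed=0):
    """Toy model where strategy X never touches the ranking score.

    Returns the X-vs-ranking correlation (overall, marketers only,
    non-marketers only). All numbers are invented for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_sites):
        is_marketer = i % 2 == 0                   # half the sites are run by marketers
        adoption = 0.8 if is_marketer else 0.1     # assumed adoption rates for strategy X
        uses_x = 1.0 if rng.random() < adoption else 0.0
        # Ranking depends only on who runs the site plus noise; uses_x is not an input.
        rank = (2.0 if is_marketer else 0.0) + rng.gauss(0, 1)
        rows.append((is_marketer, uses_x, rank))

    def corr(subset):
        return pearson([r[1] for r in subset], [r[2] for r in subset])

    overall = corr(rows)
    marketers_only = corr([r for r in rows if r[0]])
    others_only = corr([r for r in rows if not r[0]])
    return overall, marketers_only, others_only


overall, marketers_only, others_only = simulate_marketer_confound()
print(f"all sites:      {overall:.2f}")
print(f"marketers only: {marketers_only:.2f}")
print(f"non-marketers:  {others_only:.2f}")
```

Across all sites, strategy X should show a clearly positive correlation with rankings. Within each group, where the "who runs the site" difference is held fixed, the correlation should hover near zero. That gap is the whole argument: the pooled number reflects who uses X, not what X does.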

Cause and effect, indirect or otherwise, are always more important than correlation.

Postscript: Michael Martinez also wrote a good post on the topic.