Is Dividing Your PageRank Really a Bad Thing? (With Math!)

Filed under: Myth-Busting, SEO Auditing.

If there is just one piece of jargon that has destroyed the SEO industry's understanding of PageRank, it's got to be "link juice."

Never have two words uttered in succession created so many misconceptions.

(In shame, I have to admit I've probably used this evil incantation at some point in the past.)

Here's the truth: PageRank is counterintuitive, so counterintuitive that most people who try to "sculpt it," one way or another, will end up shooting themselves in the foot. And I'm not saying that because I believe they're going to get penalized. I'm saying that because they simply don't understand how PageRank works.

Here's what made me want to write this post.

I was just reading through Brandon Buttars' post on categories and tags over at Avalaunch Media. In general, I agree with his advice. But he said something that made me stop and think:

"Each page of your site starts with 100% of the page's total link value and that value is divided among the links on the page. To get the most out of the links on your page you want to minimize the link bleeding. Link bleeding refers to link value being sent to worthless pages like your contact page, about page, etc. Every link on the page decreases the total value passed through each link, so less links adds more value to each link."

What Brandon is saying is true on the micro-scale. Unlike with later ranking factors, Google has always been very upfront about how the (original) PageRank algorithm works. Each page passes about 85% of its PageRank forward to the pages it links to, and that PageRank is divided equally between the links. So, if it links to two pages, the PageRank is split in half. If it links to a third page, the PageRank is split into thirds, meaning each page gets less PageRank.

The knee-jerk reaction, then, is to assume that if you put more links on the pages of your site, you're going to end up "bleeding link juice" at the expense of other pages on your site.

Is that always true?

No, it turns out, it isn't always true.

Well, it sort of is, and it sort of isn't. It depends. But I can show you how it works, and I can prove it with math.

Link Diagrams: Examining How PageRank Works

To answer the question of whether dividing PageRank across your site actually hurts PageRank, we need to revisit how PageRank works. I'm going to do this visually, because it's not an easy thing to think about.

Let's start with the simplest non-trivial link diagram, one page linking to another page:

pagerank 01

We assume that the page at the top inherits 1 "unit" of PageRank from inbound links. Technically, every page also inherits a tiny, tiny portion of PageRank, even if it has no inbound links, but this portion is so small that we can ignore it.

When this page links to the page below it, the PageRank that it passes forward is multiplied by the damping factor, d, which we are told is usually 0.85. The next page, then, gets d × 1 = d PageRank, or 0.85 units.

So far, nothing particularly surprising. Okay, how about if it's passed on to multiple pages?

pagerank 02

Alright, nothing new there either. On top of the damping factor, the PageRank is divided among the three pages, so that each inherits d/3 PageRank.
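
As a rough sketch of the rule so far, assuming the simplified original model used in this post, the amount passed through each link can be written as a one-line function. The name pagerank_per_link is made up for illustration.

```python
def pagerank_per_link(page_rank: float, outbound_links: int, d: float = 0.85) -> float:
    """PageRank passed through each link in the simplified original model."""
    if outbound_links < 1:
        raise ValueError("a page needs at least one outbound link to pass PageRank")
    # The damping factor is applied once, then the result is split evenly.
    return d * page_rank / outbound_links


one_link = pagerank_per_link(1.0, 1)     # the single-link diagram: d
three_links = pagerank_per_link(1.0, 3)  # the three-link diagram: d / 3
print(one_link, round(three_links, 3))
```

With one link the subpage gets the full d; with three links each subpage gets a third of it.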

How about something a little more interesting? Something that just might be a game changer, actually:

pagerank 03

Uh oh, here comes the algebra.

See, when a page links to a page that links back to the first page, things start to look a bit different. The PageRank "snowballs." Now the main page doesn't just inherit 1 unit of PageRank from its other links. It also inherits back some of the PageRank it sent to the other page. A bit of algebra tells us x, the PageRank of the main page:

x = 1 + d^2 x

x = 1/(1 - d^2)

if d = 0.85:

x ≈ 3.6

dx ≈ 3.06

Yes, you read that right. If the subpage links back to the main page, the main page no longer has 1 unit of PageRank. It has 3.6 units of PageRank. As a result, the subpage now inherits 3.06 units of PageRank. Here we learn our first lesson: never throw away PageRank by linking to a dead end.
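
To check that algebra numerically, here is a small sketch that simply passes PageRank around the loop until the numbers stop changing. The function solve_pagerank and the page names are hypothetical, and the model is the simplified one from this post: 1 unit flows into the main page from outside, and the tiny baseline share every page gets is ignored.

```python
def solve_pagerank(links, external, d=0.85, iterations=300):
    """Repeatedly apply PR(p) = external(p) + d * sum(PR(q) / outlinks(q)).

    links maps every page to the list of pages it links to.
    external maps a page to the PageRank it receives from outside the diagram.
    """
    pr = {page: 0.0 for page in links}
    for _ in range(iterations):
        updated = {page: external.get(page, 0.0) for page in links}
        for page, targets in links.items():
            for target in targets:
                # Each link carries an equal share of the damped PageRank.
                updated[target] += d * pr[page] / len(targets)
        pr = updated
    return pr


loop = {"main": ["sub"], "sub": ["main"]}
ranks = solve_pagerank(loop, external={"main": 1.0})
print(round(ranks["main"], 2), round(ranks["sub"], 2))  # 3.6 3.06
```

The iteration settles on the same values as the algebra, roughly 3.6 for the main page and 3.06 for the subpage.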

Some people are under the impression that you can "capture" PageRank by pointing links to a page that you want to rank, then making sure that page doesn't link to anything. Here we see the danger of the "link juice" analogy. If PageRank still works the way it did in the original algorithm, you could end up throwing away as much as 72% of your PageRank: the main page keeps 1 unit instead of the 3.6 it would have had with a link back.
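
The 72% figure is just arithmetic on the two cases. A quick sketch, again assuming d = 0.85 and the simplified model, with variable names chosen for illustration:

```python
d = 0.85

main_with_link_back = 1 / (1 - d**2)  # subpage links back to the main page
main_with_dead_end = 1.0              # subpage links nowhere, so nothing returns

# Fraction of the main page's potential PageRank that the dead end gives up.
share_lost = 1 - main_with_dead_end / main_with_link_back
print(f"{share_lost:.2%}")  # 72.25%
```

Algebraically the lost share is 1 - (1 - d^2) = d^2, so it depends only on the damping factor.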

Of course, there's a very good chance all of this has changed anyway (don't forget it!), but the goal of this discussion is to address misconceptions about PageRank as we understand it, not to speculate about how it may have changed since.

A Look At Two Extremes: Link Trees and Link Clusters

Now that we have some idea of how these link diagrams work, I want to look at two extremes for internal link architecture: what I'm going to call link trees and link clusters. Neither one of these would be a good design for anything other than a small site, but together they help us understand something important about how PageRank works.

Here is my conception of a link tree:

pagerank 04

Starting to look a bit cluttered, isn't it?

Here's the important thing to notice.

The only difference between this diagram and the last one is that there are three subpages instead of one. And, it turns out, that doesn't make a very big difference. Look at all the arrows pointing back at the homepage. Take a look at what that adds up to:

x = 1 + d^2 x/3 + d^2 x/3 + d^2 x/3

and guess what that adds up to?

x = 1 + d^2 x

Look familiar? It's exactly the same as it was last time. 

As you can probably guess, this scales as large as you want it to. It doesn't matter how many subpages you have. If the only page they link back to is the homepage, the homepage always inherits 3.6 times as much PageRank as it would on its own.

From the perspective of the homepage, it doesn't matter at all how many times you divide your PageRank among subpages.

Of course, from the perspective of the subpages, it does matter. The more subpages you add, the more you divide the PageRank for each subpage.
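
Here is a sketch of the link tree for any number of subpages, under the same simplified model. The function tree_pagerank is a hypothetical helper that iterates the two relationships (homepage to subpages, subpages back to homepage) until they settle.

```python
def tree_pagerank(n_subpages, d=0.85, iterations=300):
    """Homepage links to n subpages; each subpage links only back to the homepage."""
    home, sub = 0.0, 0.0
    for _ in range(iterations):
        # Home gets 1 unit from outside plus d * sub from each of the n subpages.
        # Each subpage gets an equal 1/n share of d * home.
        home, sub = 1.0 + n_subpages * d * sub, d * home / n_subpages
    return home, sub


for n in (1, 3, 10, 100):
    home, sub = tree_pagerank(n)
    print(n, round(home, 2), round(sub, 4))
```

The homepage value stays near 3.6 for every n, while the per-subpage value shrinks in proportion to 1/n.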

Alright, now let's take a look at link clusters:

pagerank 05

For obvious reasons, I haven't included the values being passed through each link. My conception of a link cluster, a completely flat architecture, is one where each page links to every other page.

If you think carefully about how this is set up, you will come up with this system of equations:

x = 1 + dy

y = dx/n + [(n-1)dy]/n

Which gets these answers:

x = (dn-d-n)/[(d-1)(d+n)]

y = -d/[(d-1)(d+n)]

[Edit: I neglected to mention in my first draft that n here represents the number of subpages.]
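
One way to sanity-check those closed-form answers is to solve the two equations by repeated substitution and compare. This is only a sketch; the function names are made up for illustration, and d = 0.85 is assumed.

```python
def cluster_closed_form(n, d=0.85):
    """The closed-form answers quoted above, with n subpages."""
    x = (d * n - d - n) / ((d - 1) * (d + n))
    y = -d / ((d - 1) * (d + n))
    return x, y


def cluster_by_iteration(n, d=0.85, iterations=300):
    """Solve x = 1 + d*y and y = d*x/n + (n-1)*d*y/n by repeated substitution."""
    x = y = 0.0
    for _ in range(iterations):
        x, y = 1 + d * y, d * x / n + (n - 1) * d * y / n
    return x, y


for n in (1, 2, 5, 50):
    fx, fy = cluster_closed_form(n)
    ix, iy = cluster_by_iteration(n)
    # Both routes should agree to within floating-point noise.
    assert abs(fx - ix) < 1e-9 and abs(fy - iy) < 1e-9
```

With n = 1 the cluster is the same two-page loop as before, so the closed form collapses to x = 1/(1 - d^2).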

These results are anything but self-explanatory, so let's take a look at some graphs comparing the results with the tree structure:

pagerank 09

So, as we might have expected, adding more subpages does in fact reduce the PageRank of the main page with this link structure. But take a look at what happens with the subpages:

pagerank 10

That's right. Interlinking all of the pages is actually increasing the PageRank for the subpages.
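
The same comparison can be reproduced as numbers. A sketch using the formulas from this post, with d = 0.85 assumed and the function names tree and cluster made up for illustration:

```python
D = 0.85


def tree(n, d=D):
    """(main, subpage) PageRank when subpages link only back to the main page."""
    main = 1 / (1 - d**2)
    return main, d * main / n


def cluster(n, d=D):
    """(main, subpage) PageRank when every page links to every other page."""
    denom = (1 - d) * (d + n)
    return (d + n - d * n) / denom, d / denom


print("  n  tree main  cluster main  tree sub  cluster sub")
for n in (1, 2, 5, 10, 50):
    (tm, ts), (cm, cs) = tree(n), cluster(n)
    print(f"{n:>3}  {tm:9.3f}  {cm:12.3f}  {ts:8.3f}  {cs:11.3f}")
```

For every n above 1, the cluster column is lower for the main page and higher for the subpages, which is the pattern the two graphs show.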

PageRank Doesn't Follow Intuition

By looking at these two extremes, we can see that PageRank doesn't always follow our intuition when we try to fit it into metaphors like "link juice." While increasing the number of outbound links does divide the PageRank leaving each individual page, the effects on the site as a whole can be counterintuitive:

  • Linking to a dead end page doesn't "hoard" PageRank. It only causes you to throw away PageRank.
  • Pages don't lose PageRank through outbound links. In fact, if the pages they link to link back, their PageRank increases.
  • Higher pages don't necessarily lose much (or any) PageRank as more lower pages are added, depending on how the links are structured.
  • Heavy interlinking between subpages will indeed cause main pages to approach the same amount of PageRank as they would have if they stood on their own. However, this interlinking actually increases the PageRank of the subpages rather than diminishing it.

With this in mind, it should go without saying that relevance and ease of navigation should come long before thoughts of "bleeding your link juice."

  • Hugo

    Hi Carter!! Thank you for this spectacular post. Just wanted to ask 2 things:
    1. What about internal nofollow links? I actually assume that they do not pass any PageRank to the linked page, but they do waste it from the linking page, right? In that case, should we AVOID internal nofollow linking?
    2. In a large and deep business directory site (5M indexed pages with almost no duplicates), I am forced to link many times to non-indexable pages (short selling pages, one for each company, that differ from each other only in a tiny piece of info, so they have noindex,follow), and I would like to know if I could re-use their achieved PageRank by linking back to their respective superior company page?

    • Carter Bowles

      1. Based on what Matt Cutts has told us, yes, nofollow links still divide PageRank, but don't pass it forward, so you should essentially never nofollow your internal links. There's some speculation that nofollow links might impact rankings, and there are some circumstances where they probably do (Google guessing that the nofollow isn't intentional), but I would never count on that happening.

      2. If they are noindex, follow, then you can recover the PageRank by linking back to the superior page, yes, and based on the publicly available PageRank algorithm, that should actually increase the PageRank for the superior page (as discussed above). I think in all likelihood Google has since updated the algorithm so that small link loops like those don't snowball PageRank like they used to, but in any case it certainly doesn't hurt, and is always better than linking to a dead end.

      • Hugo

        In order to INCREASE PAGERANK through company pages -the ones that I need to rank- and UNBURY THEM, I was thinking about including horizontal links pointing to related companies -same level in information architecture.

        Our database is about 4.2M pages and I was thinking about including in each company page, a module of "related companies" with around 10-20 other companies.

        It will become a matter of how often (or how many times) we link to each individual company and how it WILL AFFECT PAGERANK FLOW. If I have a bottom line of 4.2M company pages and I place 10 random "related companies" links in each, I'll finally have 42M internal/horizontal links with no dead points.

        How would you do it? Should I make those links in "related companies" modules RANDOM (so every company is linked equally)?

        Thank you

        • Carter Bowles

          I wouldn't recommend linking to 10 other random pages. That's overestimating the importance of PageRank to the algorithm. Relevance is very important. I would make sure that the links are to similar companies.

  • Ben Morel

    A nice post, and the statistical effects on PR passing of "link trees" are very interesting. Have you investigated what happens when one of those pages links out without a reciprocal, effectively acting as a PR sink?

    Also, one slight niggle. You say that "PageRank is divided equally between the links" and while this "random surfer" model was the one that inspired the original paper Google's actually always been built on a "preferential surfer" model. This means that some links have higher/lower damping factors than others - although this won't change the results of the algebra for a "link tree" it definitely will for "link clusters" for networks with non-reciprocated links.

    • Carter Bowles

      If I'm interpreting what you're saying correctly, then you're talking about a subpage that links away from the site, without linking back to the superior page, right? If so, the impact on internal PageRank would be identical to linking to a dead end: no better, no worse.

      (It's worth noting that linking to other sites can result in getting a link back, which if it isn't an intentional link scheme, can help your rankings. Matt Cutts has also said, vaguely: "In the same way that Google trusts sites less when they link to spammy sites or bad neighborhoods, parts of our system encourage links to good sites.")

      On your second point, yeah, I was careful to point out that the PageRank algorithm here is the original algorithm, which has since been modified in countless ways. It's difficult to guess at exactly how that has changed things, so I avoided putting any speculation into the post as far as that goes.