Sum of heights in a binary tree

Executive summary: every year when I teach data structures I forget how to analyze the cost of building a binary heap, which amounts to summing the heights of all the nodes in a full binary tree. So I’m writing down the (lovely) proof here in the hope that I will remember it next time.

Suppose you have a full binary tree (every level completely filled) and you do an operation on every node, where the cost of the operation is proportional to the height of that node. That is, the cost for each of the roughly n/2 leaves is 0, for each of the roughly n/4 nodes in the next level up the cost is 1, and so on. We can visualize the scenario like this:

As a function of the total number of nodes n, how expensive is this? We can see that O(n lg n) is an upper bound, since there are n nodes and the height of each node is at most lg n. But the true cost seems like it might be smaller than this, since, intuitively, most of the nodes have a height which is much smaller than lg n.
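To get a feel for the numbers before proving anything, here is a small sketch that adds up the heights level by level and prints the result next to n lg n. The function name `sum_of_heights` and the parameter `levels` are my own inventions for illustration, and I am assuming the tree has exactly 2^levels - 1 nodes.

```python
import math

def sum_of_heights(levels: int) -> int:
    """Sum the heights of all nodes in a tree with `levels` completely filled levels."""
    total = 0
    for h in range(levels):
        # There are 2^(levels - 1 - h) nodes at height h (leaves have h = 0).
        total += h * 2 ** (levels - 1 - h)
    return total

for levels in (3, 5, 10, 20):
    n = 2 ** levels - 1
    # Columns: number of nodes, sum of heights, the n lg n upper bound (rounded).
    print(n, sum_of_heights(levels), round(n * math.log2(n)))
```

In every row the middle column stays below n, which is at least consistent with the intuition that the n lg n bound is too pessimistic.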

(One specific motivation for this scenario is that we can build a binary heap from an arbitrary set of data by looping over the nodes from the bottom up and calling reheapDown on each; in the worst case reheapDown takes time proportional to the height of the node, as in this scenario. But it doesn’t matter if you don’t know about binary heaps.)
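For readers who do want to see the heap connection concretely, here is a minimal sketch of bottom-up heap construction that counts how many swaps it performs. I am assuming a min-heap stored in an array with the children of index i at 2i + 1 and 2i + 2; `reheap_down` and `build_heap` are just my names for the two pieces, not any particular library's API.

```python
import random

def reheap_down(a: list, i: int) -> int:
    """Sift a[i] down in a min-heap; return the number of swaps performed."""
    n = len(a)
    swaps = 0
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and a[left] < a[smallest]:
            smallest = left
        if right < n and a[right] < a[smallest]:
            smallest = right
        if smallest == i:
            return swaps
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest
        swaps += 1

def build_heap(a: list) -> int:
    """Heapify `a` in place from the bottom up; return the total number of swaps."""
    total = 0
    # Leaves cost nothing, so start at the last node that has a child.
    for i in range(len(a) // 2 - 1, -1, -1):
        total += reheap_down(a, i)
    return total

data = random.sample(range(1000), 127)
swaps = build_heap(data)
print(len(data), swaps)
```

Each call to `reheap_down` can swap at most once per level below the node, so the total number of swaps is bounded by the sum of the heights.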

Let’s take the same tree and put a dollar at every node, for a total of $n:

Now imagine sliding all the money as far up and to the right as it will go. That is, we take each dollar and keep moving it up to its parent as long as the node it sits on is a left child. As soon as it reaches a node which is a right child (or the root), it stops. The tree ends up looking like this:

Now take each pile of money and move it up one step to its parent, except the money at the root of the tree, which you can put in your pocket.

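And voilà! We now have exactly enough money at each node to pay for the cost of the operations: a right child of height h collected h + 1 dollars (its own, plus one from each node on the path of left children below it), and its parent has height h + 1. We even have a bit left over (which we can use to buy coffee). But we started with $n and only shuffled money around; this shows that the total cost is actually O(n).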
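If you would rather check the money-shuffling mechanically, the two steps can be simulated. This is only a sketch under my own assumptions: the nodes are numbered 1..n in heap order (root 1, children of i at 2i and 2i + 1, so even numbers are left children), and the names `slide_money` and `height` are made up for the occasion.

```python
def slide_money(levels: int):
    """Simulate the dollar-sliding argument on a tree with `levels` filled levels.

    Returns (money, pocket), where money[i] is what node i holds at the end
    and pocket is the pile taken from the root.
    """
    n = 2 ** levels - 1
    # Step 1: slide each node's dollar up while it sits on a left child.
    piles = [0] * (n + 1)
    for i in range(1, n + 1):
        j = i
        while j > 1 and j % 2 == 0:  # even index = left child
            j //= 2
        piles[j] += 1
    # Step 2: every pile moves up to its parent; the root's pile is pocketed.
    money = [0] * (n + 1)
    pocket = piles[1]
    for i in range(2, n + 1):
        money[i // 2] += piles[i]
    return money, pocket

def height(i: int, levels: int) -> int:
    """Height of node i: the number of levels below it."""
    return levels - i.bit_length()

levels = 4
n = 2 ** levels - 1
money, pocket = slide_money(levels)
# Every node holds exactly as many dollars as its height.
assert all(money[i] == height(i, levels) for i in range(1, n + 1))
print(n, sum(money), pocket)  # prints "15 11 4"
```

So with 15 nodes the operations cost 11 dollars in total, and the 4 dollars in your pocket are the coffee money.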

Exercise for the reader: what does this have to do with the number of bit flips needed to count from 1 to n with a binary counter?
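If you want to experiment with the exercise, here is a small sketch that simply counts the flips. It does not answer the question. I am assuming the counter already shows 1 at the start, so only the changes from k to k + 1 are counted; the name `bit_flips` is mine.

```python
def bit_flips(n: int) -> int:
    """Total number of bits that change while counting 1, 2, ..., n."""
    flips = 0
    for k in range(1, n):
        # XOR marks the bit positions that differ between k and k + 1.
        flips += bin(k ^ (k + 1)).count("1")
    return flips

for n in (7, 15, 31, 63):
    print(n, bit_flips(n))
```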


About Brent

Associate Professor of Computer Science at Hendrix College. Functional programmer, mathematician, teacher, pianist, follower of Jesus.
This entry was posted in math, teaching.

2 Responses to Sum of heights in a binary tree

  1. Funny and elegant solution. I remembered the result from university, but now I can easily remember even the “proof”.

