1. Introduction

In this tutorial, we’ll show how to find the central nodes of a tree.

2. Tree Centers

The graph theory defines a tree center as the node with minimal eccentricity (the length of the longest path to any other node):

Tree Centers - Example

On the left, we have an undirected and unrooted tree. On the right, we see the same tree rooted at its center.

That’s why we care about tree centers. Sometimes, we are given a tree as a regular graph, so no node is marked as the root. Our job is to root the tree, and doing so at the central node pays off in many applications.

We’ll assume that the trees we’re working with are unweighted, undirected, and unrooted.

2.1. Center and Diameter

The diameters of a tree are the longest paths in it. As we saw, there can be more than one. We can prove that the center of a tree must lie in the middle of its diameter(s).

Let x and y be the endpoints of a diameter containing an odd number of nodes, and let z be its middle node.

Let another node z' \neq z be the tree’s center and w the diameter node closest to z':The center is in the middle of the diameter

Depending on whether w is between z and y or z and x, either the path (x, \ldots,w, \ldots, z') is longer than (x, \ldots, z) or (y, \ldots, w, \ldots, z') than (y, \ldots, z). That’s impossible since, by our assumption, z' has the minimum eccentricity. So, z' can’t be the center, which means that z must be.

If the diameter contains an even number of nodes, there are two middle nodes, and both will be the tree’s centers.

3. Algorithm

That gives us an idea of how to find the center(s) in unweighted trees.

Since central nodes are in the middle of a diameter, we can start by removing the endpoint nodes one layer at a time until we reach the middle nodes. 

Each node in the outmost layer has only one neighbor, which we can call its parent. If we remove the surface nodes, their parents will have only one neighbor. As we move toward the center, the eccentricity decreases by one since all the edges have the same cost (1). So, we repeat this process by moving inward until we get one or two central nodes:

algorithm FindingCenterOfUnweightedTree(T):
    // INPUT
    //    T = an undirected, unrooted, and unweighted tree
    // OUTPUT
    //    The center of T if T is not empty, an empty set otherwise

    if T is empty:
        return empty set

    degrees <- the nodes degrees
    surface <- find the nodes with only 1 neighbor

    while |surface| > 2:
        inner <- empty set

        for node in surface:
            parent <- the only neighbor of node
            degrees[parent] <- degrees[parent] - 1
            if degrees[parent] = 1:
                inner <- inner union {parent}

        surface <- inner

    return surface

Although we assume the tree is unrooted, this algorithm implicitly defines its structure. The nodes it processes in the same iteration belong to the same level. Those processed in iteration i+1 are the parents of the nodes handled in iteration i, and the tree’s center becomes its root.

3.1. Example

Let’s run an example. In the first iteration, the surface nodes are A, G, H, I, and K:

Finding the center - step 1

In the second one, those are B, J, and F:

Finding the center - step 2

We process C and E in the third iteration and leave the loop with D as the found center.

3.2. Correctness

Let T_i be the tree we get by removing the ith iteration’s surface layer from T_{i-1}. Those trees are implicit: we don’t construct them but can derive them during the algorithm’s execution.

Our loop invariant is that \boldsymbol{T_i} has the same center nodes as the input tree \boldsymbol{T}. Let’s prove it by induction.

The base case is i=0, i.e., the situation before the main loop, which trivially holds: T has the same centers as itself.

Let’s assume that the invariant is true at the end of the ith iteration. Does it hold at the end of the iteration i+1? In other words, do T_{i+1} and T_i have the same centers?

We can show that’s the case by noting that a diameter’s endpoints have the degree one. If we remove such nodes from T_i, all the diameters will shrink by a node from both sides. So, those in the middle will stay the same. As a result, T_{i+1} and T_i have the same centers.

Consequently, the nodes we end up with are T‘s central nodes. There can be two centers at most because each diameter has either one or two middle nodes. Furthermore, we’ve shown that the diameters intersect at the centers!

3.3. Complexity

No node in an m-ary tree has more than m+1 neighbors. In the worst case, a node’s edges are in a linked list, so we need to traverse it to find its degree. In such a tree with n nodes, computing the degrees takes O(mn) time. If m is a constant, that will reduce to O(n). Similarly, if we store neighbors in an array, we can read its length in O(1) time, which gives the overall complexity of O(n).

Further, each edge is processed in the main loop only once. Since a tree with n nodes has n-1 edges, the loop’s time complexity is O(n).

So, this algorithm runs in linear time.

4. Conclusion

In this article, we showed how to find an unweighted tree’s central nodes in linear time. They are in the middle of the longest path(s), and there are at most two centers in each tree.


« 上一篇: 条件语句
» 下一篇: 稀疏编码神经网络