Traversing an unusual tree in Python

Question

I have an unusual tree array like this:

[[0, 1], [1, 2], [2, 3], [2, 4], [2, 5], [5, 6], [4, 6], [3, 6], [0, 7], [7, 6], [8, 9], [9, 6]]

Each element of the array is a pair, which means second one is a follower of the first, e.g.:

[0, 1] - 0 is followed by 1 [1, 2] - 1 is followed by 2

I am trying to extract array like this:

0 1 2 3 6 0 1 2 4 6 0 1 2 5 6 0 7 6 8 9 6

I couldn't code a robust traversal to extract all possible paths like this. How can I do it with Python?

How important is performance? How many elements could be in the tree? — Mark Byers
– Mark Byers, Commented Jan 11, 2010 at 15:58
In your example, if two nodes are in the same tree, the higher one comes before the lower in the list. Can we assume that this is always the case? — Mark Byers
– Mark Byers, Commented Jan 11, 2010 at 16:00
Please note that the "tree" in your example is not unusual. It is no tree at all, but a kind of directed acyclic graph (DAG) referred to as polytree (check wikipedia for the details). — Frank
– Frank, Commented Jan 11, 2010 at 16:20
The tree would make a lot more sense if the edges were undirected. Otherwise, the idea of traversing it from a root node is meaningless since there is no single well defined root here. Are you sure that if A 'follows' B, B does not also follow A? — MAK
– MAK, Commented Jan 11, 2010 at 16:24

Mark Byers · Accepted Answer · 2010-01-11 16:34:43Z

You could do it using a recursive generator function. I assume that the root node in the tree always comes before all its children in the original list.

tree = [[0, 1], [1, 2], [2, 3], [2, 4], [2, 5], [5, 6], [4, 6], [3, 6], [0, 7], [7, 6], [8, 9], [9, 6]] paths = {} for t in tree: if t[0] not in paths: paths[t[0]] = [] paths[t[0]].append(tuple(t)) used = set() def get_paths(node): if node[1] in paths: for next_node in paths[node[1]]: used.add(next_node) for path in get_paths(next_node): yield [node[0]] + path else: yield [node[0], node[1]] for node in tree: if tuple(node) in used: continue for path in get_paths(node): print path

Output:

[0, 1, 2, 3, 6] [0, 1, 2, 4, 6] [0, 1, 2, 5, 6] [0, 7, 6] [8, 9, 6]

Explanation: First I construct a list of all possible paths from each node. Then for each node that I haven't used yet I assume it is a root node and recursively find which paths lead from it. If no paths are found from any node, it is a leaf node and I stop the recursion and return the path found.

If the assumption about the order of the nodes does not hold then you would first have to find the set of all root nodes. This can be done by finding all nodes that do not appear as the second node in any connection.

As you already mentioned about assumptions, I gave rui's answer +1 because in his case even if I change the order [0, 7], [7, 6], [8, 9], [9, 6] to say [9, 6], [7, 6], [0, 7], [8, 9] it still doesn't alter the result which is pretty nice! In some cases we need that when order is not known ahead of time.

MAK · Accepted Answer · 2010-01-11 16:09:54Z

From what I understand of your question, it looks like you have a set of parent-child relationships as a list of pairs that describes a tree. You seem to be running into trouble by thinking that it has a structure like a linked list. Unlike a linked list, a tree is a more general form, it can have multiple nodes that 'follow' a given node that are called its children.

The easiest way is to simply build the tree first and then traverse it from the root. Define a Node class that has two fields, one for the value of the node and the other a list of children. Then you iterate over the items of your list adding the second element of each pair to the children list of node corresponding to the first element of the pair. After the tree is built, you use a recursive print function that prints the current node and calls itself on its children (if there are any). Calling the function on the root node should print the whole tree.

I would post some code, but this looks a lot like homework. The above explanation should be enough for a start.

Good answer. Looking at the above example though, I additionally suggest to keep a third field in the Node class that points to the parent. The problem is that in the above example 0 and 8 are both roots. As there are multiple paths to the node 6 it also isn't a tree at all, but in fact a DAG. Therefore, there can be multiple roots and the additional field allows you to iterate through them.

Michilus · Accepted Answer · 2010-01-11 16:29:09Z

The easiest way I can think of, would be to construct a dictionary that contains all possible children for a given parent, like so:

d = {} for parent, child in tree: try: d[parent].append(child) except KeyError: d[parent] = [child]

with tree = [[0, 1], [1, 2], [2, 3], [2, 4], [2, 5], [5, 6], [4, 6], [3, 6], [0, 7], [7, 6], [8, 9], [9, 6]], this would produce:

{0: [1, 7], 1: [2], 2: [3, 4, 5], 3: [6], 4: [6], 5: [6], 7: [6], 8: [9], 9: [6]}

Now it's possible to recursively traverse the tree like this:

def printPaths(d, currentPath): if currentPath[-1] not in d: print currentPath # last node can't possibly be a parent, so stop else: for child in d[currentPath[-1]]: printPaths(d, currentPath + [child]) for root in d: printPaths(d, [root])

I haven't tested the recursion, but it should give you an idea :)

rui · Accepted Answer · 2010-01-11 16:21:32Z

Here you go. Not the nicest code on earth but it works:

inputValues = [[0, 1], [1, 2], [2, 3], [2, 4], [2, 5], [5, 6], [4, 6], [3, 6], [0, 7], [7, 6], [8, 9], [9, 6]] tree = {} numberOfChildren = {} for (f, t) in inputValues: if not tree.has_key(f): tree[f] = [] tree[f].append(t) if not numberOfChildren.has_key(t): numberOfChildren[t] = 0 numberOfChildren[t] += 1 roots = [c for c in tree if c not in numberOfChildren] permutations = [] def findPermutations(node, currentList): global tree global permutations if not tree.has_key(node): permutations.append(currentList) return for child in tree[node]: l = list() l.extend(currentList) l.append(child) findPermutations(child, l) for r in roots: findPermutations(r, [r]) print permutations

This is good because even if I change the order [0, 1], [1, 2], [2, 3], ... to say [1, 2], [2, 3],[0, 1],... it still doesn't alter the result which is pretty nice!

Gordon Seidoh Worley · Accepted Answer · 2010-01-11 16:12:40Z

Looking at the problem, it seems the best approach might be to build the arrays backwards over several iterations. My idea is this, but note we have to assume that this is a tree and so the leaves can only be used once:

Let arrays = list of pairs
Until every array in arrays are leaf:
1. If an array is a leaf (last element isn't the first element in any array in arrays):
  1. For each array in arrays see if the leaf can be attached on to the end of it
  2. After attaching to all possible arrays, delete the leaf

Obviously you'll have to do some work to turn that into code, but that's a rough idea.

tttppp · Accepted Answer · 2010-01-11 16:15:31Z

You could use the find_all_paths function from the following page: http://www.python.org/doc/essays/graphs/

In order to use this you need to make two minor tweeks to your graph. First, loop through your list of edges and create a new representation of the graph like:

 graph = {0: [1, 7], 1: [2], 2: [3, 4, 5], ...}

Secondly create a supersink (in your example case you could call it 10) and attach all vertices with no edges leading from them to this new node.

Then you can call the function find_all_paths(graph, 0, 10) to find all such paths.

sth · Accepted Answer · 2010-01-11 16:35:08Z

Produce all longest pathes from all possible starting nodes:

tree = [[0, 1], [1, 2], [2, 3], ...] dtree = {} for (k, v) in tree: dtree.setdefault(k, []).append(v) parts = [[root] for root in range(10)] while parts: path = parts.pop(0) if path[-1] in dtree: for n in dtree[path[-1]]: parts.append(path + [n]) else: print path

If it should only produce paths that are not part of a different, longer path starting at some other node, parts would need to be initialized to all nodes not contained in [p[1] for p in tree]. And if you want all paths instead, not just the longest ones, there should be a print in every iteration of the while-loop.

ondra · Accepted Answer · 2010-01-11 17:22:01Z

The following works - generate the trees starting from root. The roots are considered the nodes that do not have a parent.

import operator def genpaths(data): # Initialize dictionary ddata = {} for item in data: ddata.setdefault(item[0], []).append(item[1]) def genpath(root): "Generate paths starting with root" if root not in ddata: yield (root, ) else: for child in ddata[root]: for path in genpath(child): yield (root, ) + path for root in set(ddata.keys()) - set(reduce(operator.add, ddata.values())): for path in genpath(root): print path

Collectives™ on Stack Overflow

Traversing an unusual tree in Python

8 Answers 8

1 Comment

1 Comment

Comments

1 Comment

Comments

Comments

Comments

Comments

Hot Network Questions

Collectives™ on Stack Overflow

8 Answers 8

1 Comment

1 Comment

Comments

1 Comment

Comments

Comments

Comments

Comments

Related