Data structure for finding the sum of edge weights on a path

Question

Let $T$ be a tree, and there is a weight function on the edges $w:E\to X$. $(X,\oplus)$ is a monoid structure.

Define $f(u,v) = \bigoplus_{i=1}^k w(e_i)$, where $e_1,\ldots,e_k$ is the unique path from $u$ to $v$.

Can we preprocess the tree, such that we can use linear space (or close to linear space), so we can answer queries $f(u,v)$ in $O(\mathrm{polylog} (n))$ monoid operations? The problem become more interesting if we allow removing edges and add edge between two vertices in distinct trees.

We can also ask the question on a DAG, with weights from a commutative monoid. We want to preprocess in linear time, and query the result of $f(u,v)= \sum_{e} w(e)$, where $e$ is in some path from $u$ to $v$, in $O(\mathrm{polylog} (n))$ time.

If we consider the very special case where the entire graph is just a path, then what we want is a dynamic version of a Fenwick tree. A finger tree can solve the problem in $O(\log n)$ time.

Do you know anything about the height of the tree? Are you guaranteed that the tree is balanced, or that its height is $O(\lg n)$, or something like that? In the case of a DAG, do you have an upper bound on the length of the longest path? — D.W.
– D.W. ♦, Commented Nov 4, 2013 at 19:03
There is no guarantee of heights. It's a general tree not a rooted tree, so in some sense height doesn't mean anything. (Of course, one can replace height with diameter). Note when the diameter is $n$(just a path), we know how to do it in $O(n)$ preprocessing time and $O(\log n)$ query time, so I think maybe there is a solution where the diameter is not involved at all. — Chao Xu
– Chao Xu, Commented Nov 4, 2013 at 21:25

D.W. · Accepted Answer · 2013-11-04 19:51:46Z

Static case. Suppose the tree is static. Let $d$ denote the average depth of the nodes of the tree, taken over all nodes. Then you can solve this problem with $O(nd)$ preprocessing, a data structure of size $O(nd)$, and query time of $O(1)$. As a consequence, if the tree is balanced or of height $O(\lg n)$, the preprocessing and data structure size are $O(n \lg n)$ and the query time is $O(1)$.

Here's how. My solution is inspired by Yuval's, but takes into account that we do not have an inverse. Basically, you just build a map that stores mappings $(u,v) \mapsto f(u,v)$, where $u,v$ range over all pairs of tree nodes such that $u$ is an ancestor of $v$. This has $O(nd)$ entries and can be computed in $O(nd)$ time during preprocessing. Also, precompute a data structure to help you solve the least common ancestor problem. Now, to compute $f(u,v)$ where $u,v$ are an arbitrary pair of nodes (not necessarily an ancestor of the other), find their least common ancestor $p$ and compute $f(u,v) = f(p,u) + f(p,v)$ (with the convention that $f(v,v)=0$ for all $v$). Thus, queries can be solved in $O(1)$ time, using a sufficiently good algorithm for LCA.

It's also possible to get slightly better efficiency for the above scenario. Instead of storing all mappings $(u,v) \mapsto f(u,v)$ where $u$ is an ancestor of $v$, we'll store just a subset of them.

Let $r$ denote the root. Focusing on a single leaf $\ell$, we have a path from $r$ to $\ell$; if the leaf is at depth $d$, this path has length $d$. So, we can build a balanced binary tree $T_\ell$ (akin to a Fenwick tree) over this path, where each node of $T_\ell$ corresponds to a mapping $(u,v) \mapsto f(u,v)$ (where $u$ is an ancestor of $v$ in the original tree $T$). The root of $T_\ell$ is the mapping $(r,\ell) \mapsto f(r,\ell)$, corresponding to a path of length $d$. It has two children, $(r,m) \mapsto f(r,m)$ and $f(m,\ell) \mapsto f(m,\ell)$, corresponding to two paths of length about $d/2$ (here $m$ is a node halfway along the path from $r$ to $\ell$). Each leaf of $T_\ell$ represents a single edge in the path from $r$ to $\ell$. In this way, we get a tree $T_\ell$ of size $O(d)$, where $d$ is the depth of $\ell$ in the original tree $T$.

Now take the union of the $T_\ell$, over all leaves $\ell$ of the original tree $T$ (sharing common nodes in these trees). This gives us a map that stores mappings $(u,v) \mapsto f(u,v)$, but now with a smaller subset of entries in the mapping than before. Given any pair of nodes $p,v$ where $p$ is an ancestor of $v$, we can compute $f(p,v)$ in time $O(\lg d)$, where $d$ is the depth of $v$. Therefore, we can compute $f(u,v)$ for arbitrary $u,v$ in time $O(\lg d)$, where $d$ is the depth of the deeper of $u,v$.

With this "forest of trees" scheme, the time to answer a single query is slightly more, but the size of the data structure is slightly less. I don't know if this will be a win in practice, but it might be. This scheme also has the advantage that, if the weight on an edge of $T$ is updated, you can fairly efficiently update the data structure.

There might be a smarter way to choose what subset of entries $(u,v)\mapsto f(u,v)$ to store in your data structure -- I'm not sure.

It's possible to extend these to the case of a static DAG, if you are promised that the length of the longest path in the DAG is not too large (if it is short and bushy/wide rather than long and narrow).

Yuval Filmus · Accepted Answer · 2013-11-04 16:53:30Z

On a tree you can reduce your problem to calculation of least common ancestor. For each node $v$, store the total weight $w(v)$ of the path from $v$ to the root; this can be calculated in linear time using DFS or BFS. Now given two nodes $u,v$, find the common ancestor $p$ and compute $f(u,v)=w(u)+w(v)-w(p)$.

ahh, true, but it is in a monoid and there might not be a inverse, having inverse speed things up by a lot. — Chao Xu
– Chao Xu, Commented Nov 4, 2013 at 18:00

Stack Exchange Network

Data structure for finding the sum of edge weights on a path

2 Answers 2

Linked

Hot Network Questions

Data structure for finding the sum of edge weights on a path

2 Answers 2

Linked

Related

Hot Network Questions