3
$\begingroup$

I'm trying to understand how vector clocks compare to DAGs for causality detection in a distributed system.

When trying to detect causality relations in a distributed system, a very commonly proposed technique is the use vector clocks. Vector clocks are presented as:

Timestamping is a common method of totally ordering events in concurrent programs. However, for applications requiring access to the global state, a total ordering is inappropriate. This paper presents algorithms for timestamping events in both synchronous and asynchronous message-passing programs that allow for access to the partial ordering inherent in a parallel system. The algorithms do not change the communications graph or require a central timestamp issuing authority.

Source: Timestamps in Message-Passing Systems That Preserve the Partial Ordering Colin J. Fidge

Another less mentioned but nonetheless ubiquitous method is to synchronize a directed acyclic graph (DAG) between the replicas. I understand the "reachability" notion of DAGs allows to define a partial order between events. To my understanding, this is Git's approach to determining if a commit took place before, after, or at the same time as another.

I've seen little to no discussion opposing vector clocks to full DAG modeling. Unless I missed anything I think those are techniques that both could be used to do an event partial ordering in a distributed system. Trying to understand what unifies or separates vector clocks and DAGs, I've come to the following conclusion so far:

What unifies them:

  1. They both could be used to determine causality relation in a distributed system

Advantages of vector clocks:

  1. Size of data to synchronize in vector clock grows with the number of replica, but not with the number of events; which in most real life application is more efficient. Also optimizing DAG data transfer between replica over the network may not be straightforward.

Advantage of DAGs:

  1. Easier to conceptualize? And possibly implement?
  2. Models the full tree, so wins over vector clocks when we want not only to understand if an event e1 occurred before another event e2, but want to access more details on the event present, possibly their content, etc.
  3. More expressive than vector clocks because vector clocks DAGs are constructed with a specific algorithm that constrains the expressiveness, and there are things you cannot do, such as taking three or more previous events to build one new event (at least from my understanding of the original paper, but that limitation may be challenged)

Is this it, or am I missing an important concept?

$\endgroup$
1
  • $\begingroup$ You write “Another less mentioned but nonetheless ubiquitous method is to synchronize a directed acyclic graph (DAG) between the replicas.” - did you find any mention? I struggle to find anything except this stackexchange question related in any way to this topic. I only know of a tangent in M. Raynal and M. Singhal, "Logical time: capturing causality in distributed systems," in Computer, 1996, doi: 10.1109/2.485846. saying that DAGs could also be used to “ensure livenes” instead of logical clocks, but the referenced paper uses DAGs for precedence, not for discrete time. $\endgroup$ Commented Nov 22, 2022 at 9:47

0

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.