A version vector is a mechanism for tracking changes to data in a distributed system, where multiple agents might update the data at different times. The version vector allows the participants to determine if one update preceded another (happened-before), followed it, or if the two updates happened concurrently (and therefore might conflict with each other). In this way, version vectors enable causality tracking among data replicas and are a basic mechanism for optimistic replication. In mathematical terms, the version vector generates a preorder that tracks the events that precede, and may therefore influence, later updates.

Version vectors maintain state identical to that in a vector clock, but the update rules differ slightly; in this example, replicas can either experience local updates (e.g., the user editing a file on the local node), or can synchronize with another replica:

Initially all vector counters are zero.
Each time a replica experiences a local update event, it increments its own counter in the vector by one.
Each time two replicas $a$ and $b$ synchronize, they both set the elements in their copy of the vector to the maximum of the element across both counters: $V_{a}[x]=V_{b}[x]=\max(V_{a}[x],V_{b}[x])$ . After synchronization, the two replicas have identical version vectors.

Pairs of replicas, $a$ , $b$ , can be compared by inspecting their version vectors and determined to be either: identical ( $a=b$ ), concurrent ( $a\parallel b$ ), or ordered ( $a<b$ or $b<a$ ). The ordered relation is defined as: Vector $a<b$ if and only if every element of ${\displaystyle V_{a))$ is less than or equal to its corresponding element in ${\displaystyle V_{b))$ , and at least one of the elements is strictly less than. If neither $a<b$ or $b<a$ , but the vectors are not identical, then the two vectors must be concurrent.

Version vectors^[1] or variants are used to track updates in many distributed file systems, such as Coda (file system) and Ficus, and are the main data structure behind optimistic replication.^[2]

Other mechanisms

Hash Histories ^[3] avoid the use of counters by keeping a set of hashes of each updated version and comparing those sets by set inclusion. However this mechanism can only give probabilistic guarantees.
Concise Version Vectors ^[4] allow significant space savings when handling multiple replicated items, such as in directory structures in filesystems.
Version Stamps ^[5] allow tracking of a variable number of replicas and do not resort to counters. This mechanism can depict scalability problems in some settings, but can be replaced by Interval Tree Clocks.
Interval Tree Clocks^[6] generalize version vectors and vector clocks and allows dynamic numbers of replicas/processes.
Bounded Version Vectors ^[7] allow a bounded implementation, with bounded size counters, as long as replica pairs can be atomically synchronized.
Dotted Version Vectors ^[8] address scalability with a small set of servers mediating replica access by a large number of concurrent clients.

Other mechanisms

References

External links