Prototype for CRDTs for Python

We implement several CRDTs in Python. Those implementations are prototypical, meaning we don’t intend them to be production-code, but to allow exploration of the subtleties around CRDTs so that we can implement them elsewhere.

Main reference:

Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. ‘A comprehensive study of Convergent and Commutative Replicated Data Types’; [Research Report] RR-7506, 2011, pp.50. <inria-00555588>.

—Available at https://hal.inria.fr/inria-00555588v1

This package requires Python 3.6+, and has been tested in CPython 3.6 and CPython 3.7.

xotl.crdt.base – Basic interfaces and API

Common interface for all CRDTs.

class xotl.crdt.base.CvRDT(*, process: xotl.crdt.base.Process)[source]

Base class for Convergent Replicated Data Types.

Basically this documents the expectation of each CvRDT. Subclasses must implement the following methods and attributes.

User facing API

The methods and properties in this sub-section are those expected all CRDTs expose to the users. These must be programmed to keep the expected semantics of the CRDT.

This base class, only describe them in an abstract way, sub-classes must provide their own implementation.

value

The current value that is managed by this CRDT.

This could be any type of value. But you must never assume changes to the value will be of any effect. Each CRDT implements methods to properly update its value.

This is a read-only property.

Internal (coordination layer) CRDT API.

Every CvRDT must implement these methods to initialize and update its state upon requests from the coordination layer.

merge(other: xotl.crdt.base.CvRDT) → None[source]

Update the CvRDT to account for the another replica’s state.

__le__(other)[source]

Compares two replicas for ‘<=’ in the semilattice.

This is NOT a relation of the value.

__eq__(other) → bool[source]

Compares two replicas for ‘==’ in the semilattice.

This is NOT a relation of the value.

Warning

The following two methods should only be used within the boundaries of a coordinated controlled layer. They may alter the internal state of CRDT in a way that could break the expected semantics unless you take measures to ensure it.

init() → None[source]

Set the initial state of a newly create CRDT.

reset() → None[source]

Reset the internal state of value, usually to the initial state.

class xotl.crdt.base.Process(name: str, order: int)[source]

Represents a process or node that holds replicated objects.

We require (for some CRDTs) that processes are uniquely named and totally ordered across the cluster. So when adding/removing a process you should take measures for not reusing old names.

Transmitting and receiving the CRDT state

The following two functions allow for CRDT to be transmitted from one process to another and/or saved in a file. They use pickle; which means you’re responsible for enforcing the required security.

xotl.crdt.base.get_state(crdt: xotl.crdt.base.CvRDT) → bytes[source]

Dumps the crdt in a way that is amenable for transmission/storage.

xotl.crdt.base.from_state(state: bytes) → xotl.crdt.base.CvRDT[source]

Reconstruct the CRDT from its dumped state.

state should be the result of calling get_state(). The following property should always hold:

assert crdt == from_state(get_state(crdt))

xotl.crdt.clocks – Clocks, Vector Clocks, etc.

Implements the Vector Clocks.

class xotl.crdt.clocks.VClock(dots: Sequence[xotl.crdt.clocks.Dot] = None)[source]
bump(process)[source]

Return a new VC with the process’s counter increased.

merge(*others) → xotl.crdt.clocks.VClock[source]

Return the least possible common descendant.

reset()[source]

Reset the clock.

Basically forget about all the clock state.

class xotl.crdt.clocks.Dot(process: xotl.crdt.base.Process, counter: int)[source]

A component on the vector clock.

xotl.crdt.counter – The Counter CRTDs

class xotl.crdt.counter.GCounter(*, process: xotl.crdt.base.Process)[source]

A increment-only counter.

User API

incr()[source]

Increases the counter by one.

class xotl.crdt.counter.PNCounter(*, process: xotl.crdt.base.Process)[source]

A counter that allows increments and decrements.

User API

incr()[source]

Increase the counter by one.

decr()[source]

Decreases the counter by one.

xotl.crdt.register – The Registers CRTDs

class xotl.crdt.register.LWWRegister(*, process: xotl.crdt.base.Process)[source]

The Last-Write-Wins Register.

If two processes set a value concurrently (as per vector clock counter) and with the same time stamp. The process with highest priority wins.

User API

set(value)[source]

Set the value of the register.

value should be an immutable object. Putting a mutable object may lead to unexpected behavior (specially if it implements an unsafe hash).

Internal CRDT API

__lshift__(other) → bool[source]

True is other wins.

other wins if:

  • its vector clock dominates ours (it descends from ours and knows even more than we do).
  • its vector clock is concurrent with ours but other is marked with a higher timestamp.
  • none of the above, but other’s process has higher priority

xotl.crdt.sets – The Sets CRTDs

class xotl.crdt.sets.GSet(*, process: xotl.crdt.base.Process)[source]

The Grow-only set.

User API

add(item)[source]

Add item to the set.

class xotl.crdt.sets.TwoPhaseSet(*, process: xotl.crdt.base.Process)[source]

User API

add(item) → None[source]

Add item to the set.

remove(item) → bool[source]

Remove item to the set.

If item is not in (this replica’s view of) the set, do nothing. If it was, remove it and the item will never in the set again.

class xotl.crdt.sets.USet(*, process: xotl.crdt.base.Process)[source]

The USet.

Warning

You must be careful using this directly. You MUST never add the same item twice. This as way to implement the ORSet.

User API

add(item) → None[source]

Add item to the set.

remove(item) → None[source]

Remove item from the set.

If item is not in (this replica’s view of) the set, nothing happens.

class xotl.crdt.sets.ORSet(*, process: xotl.crdt.base.Process)[source]

The Observed-Remove Set.

User API

add(item)[source]

Add item to the set.

remove(item)[source]

Remove item from the set.

We remove the observed instances of item in this replica; if this replica hasn’t any, do nothing. This also means that an add(x) at one replica concurrent with a remove(x) at another, will result in the item being kept.

xotl.crdt.testing.base – Common definition to tests CRDTs

We use hypothesis.stateful to create sequences of possible actions over any replica.

There are two approaches:

Based on a test-model

Each action is also recorded in a model object that maintains the expected state a replica must reach when synchronized.

Based on a full synchronization

The synchronization is done by a gossip protocol; after synchronization all replicas must have the same value. Notice this is not coordination for agreement.

xotl.crdt.testing.counters – Testing counters

Create the rule-based machines to test GCounter and PNCounter.

class xotl.crdt.testing.counters.GCounterMachine

The stateful machinery for GCounter.

class xotl.crdt.testing.counters.PNCounterMachine

The stateful machinery for PNCounter.

xotl.crdt.testing.registers – Testing registers

xotl.crdt.testing.sets – Testing sets

Changelog

Series 0.x

Unreleased. Release 0.3.0

Nothing yet.

2018-10-16. Release 0.2.0

  • Replace ‘process’ for ‘actor’ across the code-base.
  • Extract the state dump/reconstruction from CvRDT; add functions get_state() and from_state().
  • Add Process to capture the required interface of processes.
  • Remove the timestamp from the internal vector clock. The timestamp is only used in LWWRegister; it was wasteful to have it everywhere else unused.

2018-10-03. Release 0.1.1

Correct distribution files. No source changes.

2018-10-03. Release 0.1.0

Initial release with the implementation of GCounter, PNCounter, LWWRegister, and several sets. We also include a kind-of vector clock implementation in module clocks (we use it as a primitive, so the GCounter, for instance is just an adapter of the underlying vclock.)

Indices and tables