Prototype for CRDTs for Python¶
We implement several CRDTs in Python. Those implementations are prototypical, meaning we don’t intend them to be production-code, but to allow exploration of the subtleties around CRDTs so that we can implement them elsewhere.
Main reference:
Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. ‘A comprehensive study of Convergent and Commutative Replicated Data Types’; [Research Report] RR-7506, 2011, pp.50. <inria-00555588>.
—Available at https://hal.inria.fr/inria-00555588v1
This package requires Python 3.6+, and has been tested in CPython 3.6 and CPython 3.7.
xotl.crdt.base
– Basic interfaces and API¶
Common interface for all CRDTs.
-
class
xotl.crdt.base.
CvRDT
(*, process: xotl.crdt.base.Process)[source]¶ Base class for Convergent Replicated Data Types.
Basically this documents the expectation of each CvRDT. Subclasses must implement the following methods and attributes.
User facing API
The methods and properties in this sub-section are those expected all CRDTs expose to the users. These must be programmed to keep the expected semantics of the CRDT.
This base class, only describe them in an abstract way, sub-classes must provide their own implementation.
-
value
¶ The current value that is managed by this CRDT.
This could be any type of value. But you must never assume changes to the value will be of any effect. Each CRDT implements methods to properly update its value.
This is a read-only property.
Internal (coordination layer) CRDT API.
Every CvRDT must implement these methods to initialize and update its state upon requests from the coordination layer.
-
merge
(other: xotl.crdt.base.CvRDT) → None[source]¶ Update the CvRDT to account for the another replica’s state.
-
__le__
(other)[source]¶ Compares two replicas for ‘<=’ in the semilattice.
This is NOT a relation of the
value
.
-
__eq__
(other) → bool[source]¶ Compares two replicas for ‘==’ in the semilattice.
This is NOT a relation of the
value
.
Warning
The following two methods should only be used within the boundaries of a coordinated controlled layer. They may alter the internal state of CRDT in a way that could break the expected semantics unless you take measures to ensure it.
-
-
class
xotl.crdt.base.
Process
(name: str, order: int)[source]¶ Represents a process or node that holds replicated objects.
We require (for some CRDTs) that processes are uniquely named and totally ordered across the cluster. So when adding/removing a process you should take measures for not reusing old names.
Transmitting and receiving the CRDT state
The following two functions allow for CRDT to be transmitted from one process
to another and/or saved in a file. They use pickle
; which means you’re
responsible for enforcing the required security.
-
xotl.crdt.base.
get_state
(crdt: xotl.crdt.base.CvRDT) → bytes[source]¶ Dumps the crdt in a way that is amenable for transmission/storage.
-
xotl.crdt.base.
from_state
(state: bytes) → xotl.crdt.base.CvRDT[source]¶ Reconstruct the CRDT from its dumped state.
state should be the result of calling
get_state()
. The following property should always hold:assert crdt == from_state(get_state(crdt))
xotl.crdt.clocks
– Clocks, Vector Clocks, etc.¶
Implements the Vector Clocks.
xotl.crdt.counter
– The Counter CRTDs¶
xotl.crdt.register
– The Registers CRTDs¶
-
class
xotl.crdt.register.
LWWRegister
(*, process: xotl.crdt.base.Process)[source]¶ The Last-Write-Wins Register.
If two processes set a value concurrently (as per vector clock counter) and with the same time stamp. The process with highest
priority
wins.User API
-
set
(value)[source]¶ Set the value of the register.
value should be an immutable object. Putting a mutable object may lead to unexpected behavior (specially if it implements an unsafe hash).
Internal CRDT API
-
__lshift__
(other) → bool[source]¶ True is other wins.
other wins if:
- its vector clock dominates ours (it descends from ours and knows even more than we do).
- its vector clock is concurrent with ours but other is marked with a higher timestamp.
- none of the above, but other’s process has higher priority
-
xotl.crdt.sets
– The Sets CRTDs¶
-
class
xotl.crdt.sets.
USet
(*, process: xotl.crdt.base.Process)[source]¶ The USet.
Warning
You must be careful using this directly. You MUST never add the same item twice. This as way to implement the
ORSet
.User API
xotl.crdt.testing.base
– Common definition to tests CRDTs¶
We use hypothesis.stateful
to create sequences of possible actions over
any replica.
There are two approaches:
Based on a test-model
Each action is also recorded in a model object that maintains the expected state a replica must reach when synchronized.
Based on a full synchronization
The synchronization is done by agossip protocol
; after synchronization all replicas must have the same value. Notice this is not coordination for agreement.
xotl.crdt.testing.counters
– Testing counters¶
Create the rule-based machines to test GCounter
and PNCounter
.
xotl.crdt.testing.registers
– Testing registers¶
xotl.crdt.testing.sets
– Testing sets¶
Changelog¶
Series 0.x¶
Unreleased. Release 0.3.0¶
Nothing yet.
2018-10-16. Release 0.2.0¶
- Replace ‘process’ for ‘actor’ across the code-base.
- Extract the state dump/reconstruction from
CvRDT
; add functionsget_state()
andfrom_state()
. - Add
Process
to capture the required interface of processes. - Remove the timestamp from the internal vector clock. The timestamp is only
used in
LWWRegister
; it was wasteful to have it everywhere else unused.
2018-10-03. Release 0.1.1¶
Correct distribution files. No source changes.