# CS计算机代考程序代写 data structure distributed system CRDT – CONFLICT‐FREE REPLICATED DATA TYPES

CRDT – CONFLICT‐FREE REPLICATED DATA TYPES

Distributed Systems (Hans‐Arno Jacobsen) 1

Pixabay.com

CRDTs Units

• Eventual consistency, informally

• State‐based objects

• Eventual consistency, more formally • Conflict‐free replicated data types

Distributed Systems (Hans‐Arno Jacobsen) 2

EVENTUAL CONSISTENCY, INFORMALLY

Distributed Systems (Hans‐Arno Jacobsen)

3

Pixabay.com

Eventual Consistency

• Eventualconsistencyisdesirableforlarge‐scale distributed systems where high availability is important

• Tendstobecheaptoimplement(e.g.,viagossip)but may serve stale data

• Constitutesachallengeforenvironmentswhere stronger consistency is important

Distributed Systems (Hans‐Arno Jacobsen) 4

Handling Concurrent Writes

• Premise for eventual consistency were scenarios with few (no) concurrent writes to the same key (cf. client‐centric consistency)

• However, we do need a mechanism to handle concurrent writes should they so happen

• If there were a way to handle concurrent writes, we could support eventual consistency more broadly

• Would “only” need to guarantee that after processing all writes for a key, all replicas converge, no matter what order the writes are processed (e.g., assuming gossip)

Distributed Systems (Hans‐Arno Jacobsen) 5

Max register L1: 0 W(4) (4) W(2) (4)

Examples

Growth‐only counter (G‐counter)

L1: 0 W(+5) (5) W(+2) (7) W(+1) 8

L2: 0 W(+2) (2) W(+5) (7) W(+1) 8 Writes propagate to L2, L1, respectively

✔

Different locations (replicas)

merge(5) 5 L2: 0 W(5) (5) W(3) (5) merge(4) ✔

State propagate to L2, L1 via periodic merging

Distributed Systems (Hans‐Arno Jacobsen)

6

5

Self‐study Questions

• Think of a few basic data structures, like lists, sets, counters, binary trees, heaps, maps, etc., and visualize for yourself what happens if replicated instances of these structures are updated via gossip.

• Does their state converge, no matter the update sequence?

• What happens if update operations are lost or duplicated?

• What mechanisms we know other than gossip could be used to keep these replicated structures updated without violating their convergence.

• What are pros and cons of these mechanisms?

Distributed Systems (Hans‐Arno Jacobsen) 7

Distributed Systems (Hans‐Arno Jacobsen) 8

CRDT – FROM STATE‐BASED OBJECTS TO REPLICATED STATE‐BASED OBJECTS

Distributed Systems (Hans‐Arno Jacobsen)

9

Pixabay.com

State‐based objects Mostly plain old objects

• Offerupdateandqueryrequeststoclients

• Maintaininternalstate

• Processclientrequests

• Perform merge requests amongst each other • Periodicallymerge(supportinfrastructure)

Distributed Systems (Hans‐Arno Jacobsen) 10

State‐based Object

• What we commonly know as object • Comprised of

– Internal state

– One or more query methods – One or more update methods – A merge method

Distributed Systems (Hans‐Arno Jacobsen) 11

class Avg(object): def __init__(self):

def update(self, x): self.sum += x self.cnt += 1

self.sum = 0

self.cnt = 0

def query(self): if self.cnt != 0:

def merge(self, avg): self.sum += avg.sum self.cnt += avg.cnt

return

else: return 0

self.sum /

self.cnt

Class Average Running Example

Distributed Systems (Hans‐Arno Jacobsen) 12

Average

State‐based object representing a running average

• Internalstate

– self.sum and self.cnt

• Query returns average

• Update updates average with a new value x

• Merge merges one Avg instance into another one

Distributed Systems (Hans‐Arno Jacobsen) 13

Replicated State‐based Object

• State‐based object replicated across multiple nodes

• E.g., replicate Avg across two nodes

• Both nodes have a copy of state‐based object

• Clients send query and update to a single node

• Nodes periodically send their copy of state‐based object to other nodes for merging

Distributed Systems (Hans‐Arno Jacobsen) 14

Node a

a0 Timeline Update

Unique

Causal history based on operation identifiers

operation identifier

Each state represents a snapshot of object in time that results from updates applied

state

query

history

State

a1

a0 sum:0, cnt:0

0

a1 sum:1, cnt:1

1 0

a2 sum:4, cnt:2 Distributed Systems (Hans‐Arno Jacobsen)

2 0,1

State

15

Operation identifier is unique across replicas

Each state represents a snapshot of object in time that results from updates applied

state

query

history

Timeline

a0 sum:0, cnt:0

0

a1 sum:1, cnt:1

1 0

a2 sum:4, cnt:2

2 0,1

Distributed Systems (Hans‐Arno Jacobsen)

16

States and Causal Histories

If y = x.update(…) where the update has identifier i, then the causal history of y is the causal history of x union { i }.

a0 sum:0, a1 sum:1, b0 sum:0, b1 sum:2, b2 sum:6,

cnt:0 cnt:1 cnt:0 cnt:1 cnt:2

0 1 0 2 3

{} {0} {} {1} {1,2}

Distributed Systems (Hans‐Arno Jacobsen)

17

state

query ()

history

a0 sum:0,

cnt:0 cnt:1 cnt:0 cnt:1 cnt:2

0{} 2 {0} 0{} 4 {1}

a1 sum:2,

b0 sum:0,

b1 sum:4,

update 2 update 4

state

query () history

Merge

a2 sum:6,

Distributed Systems (Hans‐Arno Jacobsen)

3 {0,1}

18

Nodes Periodically Propagate Their State

Distributed Systems (Hans‐Arno Jacobsen) 19

Self‐study Questions

• Think of a few basic data structures, like lists, sets, counters, binary trees, heaps, maps, etc., and visualize for yourself what happens if replicated instances of these structures are updated via gossip.

• For the above data structures, specify merge operations that merge the state of two instances of a given structure.

• Assume merge happens periodically, does your replicated structures’ state converge?

Distributed Systems (Hans‐Arno Jacobsen) 20

Distributed Systems (Hans‐Arno Jacobsen) 21

CRDT –

EVENTUAL CONSISTENCY, MORE FORMALLY

Distributed Systems (Hans‐Arno Jacobsen)

22

Pixabay.com

Eventual Consistency (EC)

• A replicated state‐based object is

– eventually consistent if whenever two replicas of the state‐based object have the same causal history, they eventually (not necessarily immediately) converge to the same internal state

Distributed Systems (Hans‐Arno Jacobsen) 23

Strong Eventual Consistency (SEC)

• A replicated state‐based object is

– strongly eventually consistent if whenever two replicas of the state‐based object have the same causal history, they (immediately) have the same internal state

• Strong eventual consistency implies eventual consistency

Distributed Systems (Hans‐Arno Jacobsen) 24

– NoMergeAverage – BMergeAverage – MaxAverage

EC or SEC

That is the question?

• Variants of our Average object, defined next – Average

• Note that some of these objects do not represent realistic functionality (i.e., needed functionality)

• These objects are meant to illustrate convergence concepts only

Distributed Systems (Hans‐Arno Jacobsen) 25

Average

a, b attain the same causal history but do not converge to the same internal state – they do not converge at all!

state

query history

Neither eventually

b0 sum:0, cnt:0 b1 sum:4, cnt:1

0

consistent, nor

4

1

strongly eventually

b2 sum:10, cnt:3 b3 sum:26, cnt:8

3.3 3.25

0,1

consistent

0,1

Distributed Systems (Hans‐Arno Jacobsen)

a0 sum:0, cnt:0

a1 sum:2, cnt:1

a2 sum:6, cnt:2

a3 sum:16, cnt:5 3.2 0,1

0 2 0 3 0,1

26

NoMergeAverage

• Object’s merge does nothing

• All else is the same as for Average

Distributed Systems (Hans‐Arno Jacobsen) 28

a, b have same causal history, both converge to a stable but different internal state.

state

query

history

Neither eventually consistent, nor strongly eventually consistent.

a0 sum:0, a1 sum:2, a2 sum:2, a3 sum:2, b0 sum:0, b1 sum:4, b2 sum:4, b3 sum:4,

cnt:0 cnt:1 cnt:1 cnt:1 cnt:0 cnt:1 cnt:1 cnt:1

0 2 2 2 0 4 4 4

0 0,1 0,1 1 0,1 0,129

Distributed Systems (Hans‐Arno Jacobsen)

BMergeAverage

• Object’s merge

– At b – overwrite state with state at a – At a – do nothing

• All else is the same as for Average

Distributed Systems (Hans‐Arno Jacobsen) 30

a, b attain same causal history, both eventually converge to the same internal state – eventual consistent.

state

query history

a1, b1 have same causal history but different internal state – not strongly eventually consistent

0 0 0 0 0 4 0 0 0

a0 sum:0, cnt:0 a1 sum:0, cnt:0 a2 sum:0, cnt:0 b0 sum:0, cnt:0 b1 sum:4, cnt:1 b2 sum:0, cnt:0

0

Distributed Systems (Hans‐Arno Jacobsen)

31

MaxAverage

• Object’s merge

– Pair‐wise max of sum and cnt

• All else is the same as for Average

Distributed Systems (Hans‐Arno Jacobsen) 32

At a, b for all states with the same causal history, they have the same internal state – strongly eventually consistent.

state

query

history

Great!!! But, what

0,1

does it actually

compute? Here,

1

update(2) overwritten

0,1

by update(4)! Distributed Systems (Hans‐Arno Jacobsen)

0,1

a0 sum:0,

a1 sum:2,

a2 sum:4,

a3 sum:4,

b0 sum:0,

b1 sum:4,

b2 sum:4,

b3 sum:4,

cnt:0 cnt:1 cnt:1 cnt:1 cnt:0 cnt:1 cnt:1 cnt:1

0 2 4 4 0 4 4 4

0

0,1

33

Lessons Learned I

• Same causal history, different internal state

• Same causal history, converge to stable but different internal state

• Same causal history, eventually same internal state – EC

• Same causal history, always same internal state – SEC

Average NoMergeAverage BMergeAverage MaxAverage

no no no yes no no yes yes no yes yes yes

C? EC? SEC?

Designing a strongly eventually consistent state‐based object with intuitive semantics is challenging!

Distributed Systems (Hans‐Arno Jacobsen) 34

Lessons Learned II

• Replicatedstate‐basedobject

• No convergence

• Convergence

• Eventualconsistencyinthismodel

• Strongeventualconsistencyinthismodel

Distributed Systems (Hans‐Arno Jacobsen) 35

Self‐study Questions

• Can you design Average such that it becomes EC or SEC as well as offers correct averaging semantics?

• Think of other data structures and design update, query, and merge operations with reasonable semantics.

• Always draw timelines and state diagrams for your designs and proof EC or SEC, if possible.

• Think of data structures that support multiple update operations and one or more query operations.

Distributed Systems (Hans‐Arno Jacobsen) 36

Distributed Systems (Hans‐Arno Jacobsen) 37

CRDT –

CONFLICT‐FREE REPLICATED DATA TYPES, 2011

Distributed Systems (Hans‐Arno Jacobsen) 38

Pixabay.comv

• • •

A CRDT is a conflict‐free replicated state‐based object

•

CRDTs are no panacea but a great solution when they apply!

Conflict‐Free Replicated Data Types

A CRDT handles concurrent writes

Intuition ‐ restrictions:

– Do not allow writes with arbitrary values, limit to write

operations which are guaranteed not to conflict

– CRDTs are data structures with special write operations; they guarantee strong eventual consistency and are monotonic (no rollbacks)

Distributed Systems (Hans‐Arno Jacobsen) 39

•

Conflict‐Free Replicated Data Types CRDTs can be commutative, op‐based (CmRDT):

•

CRDTs can be convergent, state‐based (CvRDT):

– Example: A max register, which stores the maximum

•

Therefore, the value of a CRDT depends on multiple write operations or states, not just the latest one`

– Example: A growth‐only counter, which can only process increment operations

– Propagate operations among replicas (duplicate‐free, no‐loss messaging)

value written

– Propagate and merge states (idempotent)

Distributed Systems (Hans‐Arno Jacobsen) 40

• Supports – Query

CmCRDTs and CvCRDTs are equivalent. One can be transformed into the other one and vice versa.

– Update – Merge

State‐based CRDTs

• A CRDT is a replicated state‐based object

Distributed Systems (Hans‐Arno Jacobsen) 41

CRDT Properties

A CRDT is a replicated state‐based object that satisfies

• Mergeisassociative(e.g.,(A+(B+C))=((A+B)+C)) – For any three state‐based objects x, y, and z,

merge(merge(x, y), z) is equal to merge(x, merge(y, z)) • Mergeiscommutative(e.g.,A+B=B+A)

– For any two state‐based objects, x and y, merge(x, y) is equal to merge(y, x)

• Merge is idempotent

– For any state‐based object x, merge(x, x) is equal to x

• Every update is increasing

– Let x be a state‐based object and let y = update(x, …) be

the result of applying an update to x

– Then, update is increasing if merge(x, y) is equal to y

Distributed Systems (Hans‐Arno Jacobsen) 42

max of a, b

self.x = 0 def query(self):

Max Register is a CRDT The state‐based object IntMax is a CRDT

• IntMax wraps an integer • Merge(a, b)is the

class IntMax(object): def __init__(self):

• Update(x)adds x to the wrapped integer

return self.x

def update(self, x):

• Prove that IntMax is associative, commutative, idempotent, increasing

self.x += x def merge(self,

Distributed Systems (Hans‐Arno Jacobsen)

43

assert x >= 0

other):

self.x =

max(self.x,

other.x)

Establish Four Properties of CRDT

• Associativity merge(merge(a, b), c)

= max(max(a.x, b.x), c.x)

= max(a.x, max(b.x, c.x))

= merge(a, merge(b, c))

• Impotence merge(a, a)

= max(a.x, a.x) = a.x

= a

• Commutativity merge(a, b)

= max(a.x, b.x) = max(b.x, a.x) = merge(b, a)

• Update is increasing merge(a, update(a, x)) = max(a.x, a.x + x)

= a.x + x

= update(a, x)

Distributed Systems (Hans‐Arno Jacobsen) 44

G‐Counter CRDT Replicated growth‐only counter

• Internal state of a G‐Counter replicated on n nodes is an n‐length array of non‐negative integers

• query returns sum of every element in n‐length array • add(x)when invoked on the i‐th node, increments

the i‐th entry of the n‐length array by x

– E.g., Node 0 increments 0th entry, Node 1 increments 1st

entry of array, and so on

• merge performs a pairwise maximum of the two arrays

Distributed Systems (Hans‐Arno Jacobsen) 45

PN‐Counter CRDT

Replicated counter supporting addition & subtraction

• Internal state of a PN‐Counter

– pair of two G‐Counters named p and n.

• p represents total value added to PN‐Counter

• n represents total value subtracted from PN‐Counter.

• query method returns difference p.query() – n.query()

• add(x)- first of two updates: invokes p.add(x)

• sub(x)- second of two updates: invokes n.add(x)

• merge performs a pairwise merge of p and n

Distributed Systems (Hans‐Arno Jacobsen) 47

G‐Set CRDT Replicated growth‐only set

A G‐Set CRDT represents a replicated set which can be added to but not removed from

• Internal state of a G‐Set is just a set

• query returns the set

• add(x)adds x to the set

• merge performs a set union

Distributed Systems (Hans‐Arno Jacobsen) 49

2P‐Set CRDT

Replicated set supporting addition and subtraction

• Internalstateofa2P‐Setisa

– pair of two G‐Sets named a and r

• a represents set of values added to the 2P‐Set

• r represents set of values removed from the 2P‐Set

• query method returns the set difference a.query() – r.query()

• add(x) is the first of two updates – invokes a.add(x).

• sub(x)is the second of two updates – invokesr.add(x)

• merge performs a pairwise merge of a and r Distributed Systems (Hans‐Arno Jacobsen) 51

Summary on CRDTs

• Formalized and introduced in 2011/2014 • CmCRDTs and CvCRDTs are equivalent!

• Really neat solution if applicable

• Challenge is to design new CRDTs

Distributed Systems (Hans‐Arno Jacobsen) 53

Self‐study Questions

• For all CRDTs introduced, establish its four properties.

• Create sample execution sequences for each CRDT and

complete a timeline and a state table.

• Find use cases where the introduced CRDTs apply and show how they are used.

• Think of new CRDTs and repeat the above.

Distributed Systems (Hans‐Arno Jacobsen) 1

Distributed Systems (Hans‐Arno Jacobsen) 55