Database Concurrency Control

Serialisability

Definition: A concurrent execution of transactions should always have the same end result as some serial execution of those same transactions.

Rule of thumb: r1(x) , … , w1(x), … , r2(x), … , w2(x)

Explanation: obiously is not reading on the same object that messes things up, but only writing on it. Before T1 writes x only T1 reads are allowed, thus maintaining the strict alternation of the rule of thumb; r1(x), r2(x), w1(x) is clearly dead wrong.

A conflict occurs when two different transactions write on the same object:

r1(x), w2(x)
w1(x), r2(x)
w1(x), w2(x)

it is something I should watch out for cause it might leads to problems.

A History is Conflict Serialisable (CSR) if its serialisation graph is acyclic.

Extra tip: drawing the graph will immediately show if there are loops (not serialisable) or not; in the latest example that would be 1 -> 2 -> 1; however, even if the graph if CSR there still might be anomalies, and the history might still be not serializable (although it could be, since it’s CSR).

Examples:

r2[34] , r1[56] , w1[56] , r1[34] , w1[34] , c1 , w2[34], c2

not serialisable: r2[34], r1[34], w2[34]; it is recoverable though.

r2[34] , w2[34] , r1[56] , w1[56] , r1[34] , w1[34], c2 , c1

serialisable and recoverable

Anomalies

Lost update

r1(x), r2(x), w1(x), w2(x)

w1(x) is lost in transaction :P Note however that that although this history is CSR (the graph is acyclic) I think (not 100% sure) is non-serializable: this is for sure NOT the same of running T1 first and then T2.

Inconsistent analysis

r1(x), w1(x), r2(x), r2(y), r1(y), w1(y)

say T2 sums up x and y, but T1 is moving some cash from x to y; the analysis performed by T2 is inconsistent.

Dirty Read: non-recoverable (if a transaction aborts)

w1(x), r2(x), c2, c1

when you read from a T that hasn’t committed yet: if T1 aborts, r2 read garbage data.

Dirty Write

w1(x), w2(x), w2(y), w1(y), c1, c2

when you write to a T that hasn’ committed yet: this history is clearly not serializable: they both write their stuff and the outcome depends only on the order. It is recoverable, though, since no transaction is dependent on anything else. You simply UNDO every write and there should be no problems.

Recoverability

Recoverable History: No transaction commits depending on data that has been produced by another transaction that has yet to commit
Rule of thumb: the first who writes has to be the first who commits iff a T read before the commit

Avoiding Cascading Aborts (ACA):  => avoids Dirty Reads
If H1 reads from H2 and H2 aborts, H1 has to abort as well.

Strict (ST): => avoids Dirty Reads and Dirty Writes

ST is included in ACA is included in RC.

Examples of recoverability analysis

H1 = r2[o1], r2[o2], w2[o2], r1[o2], w2[o1], r2[o3], c2, c1

o1: r2, w2, c2, c1
o2: r2, w2, r1, c2, c1
  1. T2 commits first -> recoverable
  2. Dirty read on o2 (can’t be ACA or ST)
H2 =r2[o1], r2[o2], w2[o1], w2[o2], w1[o1], w1[o2], c1, r2[o3], c2

o1: r2, w2, w1, c1, c2
o2: same
  1. T1 commits first but it doesn’t depend on T2: recoverable
  2. No dirty reads: ACA
  3. Dirty write though (can’t be ST)
H3 = r2[o1], r2[o2], w2[o2], r1[o2], w2[o1], c1, r2[o3], c2

o2: r2, w2, r1, c1, c2
  1. T1 commits before T2 but is depending on T2: non-recoverable
H4 = r2[o1], w1[o1], r2[o2], w2[o2], r2[o3], c2, r1[o2], w1[o2], w1[o3], c1

o1: r2, w1, c2, c1
o2: r2, w2, c2, r1, w1, c1
o3: r2, c2, w1, c1
  1. No T commits before another T it depends on: recoverable
  2. No dirty reads: ACA
  3. No dirty writes: ST

Two-phase Locking

It is a technique to ensure both serialisability and recoverability. The T first locks every object it wants to r/w, does whatever it has to, and when it’s done it realeases all the blocks. The structure is:

read locks rl[o], ..., r[o], ..., ru[o]
write locks wl[o], ..., w[o], ..., wu[o]

refuse rl-i[o] if wl-j[o] already held
refuse wl-i[o] if rl-j[o] or wl-j[o] already held

In other words: another T cannot read on a write-lock, and another T cannot write on both a read and write lock.

Note that a T must lock all the objects it wants do read/write (growing phase) and then unlocking all of them (shrinking phase); in other words, T can lock, do stuff, lock more, do stuff as long as it wants, but after the first unlock it cannot lock anymore;  therefore this H is valid in 2LP

wl1[x] , wl1[y] , r1[x] , w1[x] , r1[y] , w1[y] , wu1[y] , wu1[x] 

whereas this one is not

wl1[x] , r1[x] , w1[x] , wu1[x] , wl1[y] , r1[y] , w1[y] , wu1[y]

However, even with 2LP there are stuff to watch out for:

Deadolock Detection

A deadlock between two Ts occurs when, for example, two Ts have a read lock on x, and the next step for both requires a write lock on x. T1 will be waiting for T2 to unlock and the other way around. An easy way to spot deadlocks is to draw a Wait-For Graph (WFG).

H1: r1[o1], w1[o1]
H2: r2[o1], w2[o1]

r1[o1], r2[o1] // deadlock

A deadlock between three Ts occurs when the cycle involves all of the, e.g. T1 -> T3 -> T2 -> T1. Example of a dreadlock history:

H1: w1[o1], r1[o2], r1[o4]
H2: r2[o3], r2[o2], r2[o1]
H3: r3[o4], w3[o4], r3[o3], w3[o3]

w1[o1], r2[o3], r2[o2], r1[o2], r3[o4], w3[o4], r3[o3] // deadlock

How to do it: write down all the conflict pairs, draw a deadlock graph and the plug the pieces in the right order.