Methods for Concurrency control
There are main three methods for concurrency control. They are as follows:
1. Locking Methods
2. Time-stamp Methods
3. Optimistic Methods
1. Locking Methods of Concurrency Control :
"A lock is a variable, associated with the data item, which controls the access of that data item."
Locking is the most widely used form of the concurrency control. Locks are further divided into three fields:
- Lock Granularity
- Lock Types
1. Lock Granularity :
A database is basically represented as a collection of named data items. The size of the data item chosen as the unit of protection by a concurrency control program is
called GRANULARITY. Locking can take place at the following level :
i. Database level Locking :
- Database level.
- Table level.
- Page level.
- Row (Tuple) level.
- Attributes (fields) level.
At database level locking, the entire database is locked. Thus, it prevents the use of any tables in the database by transaction T2 while transaction T1 is being executed. Database level of locking is suitable for batch processes. Being very slow, it is unsuitable for on-line multi-user DBMSs.
ii. Table level Locking :
At table level locking, the entire table is locked. Thus, it prevents the access to any row (tuple) by transaction T2 while transaction T1 is using the table. if a transaction requires access to several tables, each table may be locked. However, two transactions can access the same database as long as they access different tables. Table level locking is less restrictive than database level. Table level locks are not suitable for multi-user DBMS
iii. Page level Locking :
At page level locking, the entire disk-page (or disk-block) is locked. A page has a fixed size such as 4 K, 8 K, 16 K, 32 K and so on. A table can span several pages, and a page can contain several rows (tuples) of one or more tables. Page level of locking is most suitable for multi-user DBMSs.
iv. Row (Tuple) level Locking :
At row level locking, particular row (or tuple) is locked. A lock exists for each row in each table of the database. The DBMS allows concurrent transactions to access different rows of the same table, even if the rows are located on the same page. The row level lock is much less restrictive than database level, table level, or page level locks. The row level locking improves the availability of data. However, the management of row level locking requires high overhead cost.
v. Attributes (fields) level Locking :
At attribute level locking, particular attribute (or field) is locked. Attribute level locking allows concurrent transactions to access the same row, as long as they require the use of different attributes within the row. The attribute level lock yields the most flexible multi-user data access. It requires a high level of computer overhead.
2. Lock Types :
The DBMS mailnly uses following types of locking techniques.
a. Binary Locking :
- Binary Locking
- Shared / Exclusive Locking
- Two - Phase Locking (2PL)
A binary lock can have two states or values: locked and unlocked (or 1 and 0, for simplicity). A distinct lock is associated with each database item X.
If the value of the lock on X is 1, item X cannot be accessed by a database operation that requests the item. If the value of the lock on X is 0, the item can be accessed when requested. We refer to the current value (or state) of the lock associated with item X as LOCK(X).
Two operations, lock_item and unlock_item, are used with binary locking.
A transaction requests access to an item X by first issuing a lock_item(X) operation. If LOCK(X) = 1, the transaction is forced to wait. If LOCK(X) = 0, it is set to 1 (the transaction locks the item) and the transaction is allowed to access item X.
When the transaction is through using the item, it issues an unlock_item(X) operation, which sets LOCK(X) to 0 (unlocks the item) so that X may be accessed by other transactions. Hence, a binary lock enforces mutual exclusion on the data item ; i.e., at a time only one transaction can hold a lock.
b. Shared / Exclusive Locking :
Shared lock :
These locks are reffered as read locks, and denoted by 'S'.
If a transaction T has obtained Shared-lock on data item X, then T can read X, but cannot write X. Multiple Shared lock can be placed simultaneously on a data item.
Exclusive lock :
These Locks are referred as Write locks, and denoted by 'X'.
If a transaction T has obtained Exclusive lock on data item X, then T can be read as well as write X. Only one Exclusive lock can be placed on a data item at a time. This means multipls transactions does not modify the same data simultaneously.
c. Two-Phase Locking (2PL) :
Two-phase locking (also called 2PL) is a method or a protocol of controlling concurrent processing in which all locking operations precede the first unlocking operation. Thus, a transaction is said to follow the two-phase locking protocol if all locking operations (such as read_Lock, write_Lock) precede the first unlock operation in the transaction. Two-phase locking is the standard protocol used to maintain level 3 consistency 2PL defines how transactions acquire and relinquish locks. The essential discipline is that after a transaction has released a lock it may not obtain any further locks. 2PL has the following two phases:
phase, in which a transaction acquires all the required locks without unlocking any data. Once all locks have been acquired, the transaction is in its locked
phase, in which a transaction releases all locks and cannot obtain any new lock.
A transaction shows Two-Phase Locking lechnique.
||Lock - X (A)
||acquire Exclusive lock on A.
||read original value of A
||A = A - 100
||subtract 100 from A
||write new value of A
||Lock - X (B)
||acquire Exclusive lock on B.
||read original value of B
||B = B + 100
||add 100 to B
||write new value of B
||release lock on A
||release lock on B
3. Deadlocks :
A deadlock is a condition in which two (or more) transactions in a set are waiting simultaneously for locks held by some other transaction in the set.
Neither transaction can continue because each transaction in the set is on a waiting queue, waiting for one of the other transactions in the set to release the lock on an item. Thus, a deadlock is an impasse that may result when two or more transactions are each waiting for locks to be released that are held by the other. Transactions whose lock requests have been refused are queued until the lock can be granted.
A deadlock is also called a circular waiting condition where two transactions are waiting (directly or indirectly) for each other. Thus in a deadlock, two transactions are mutually excluded from accessing the next record required to complete their transactions, also called a deadly embrace.
A deadlock exists two transactions A and B exist in the following example:
Transaction A = access data items X and Y
Transaction B = access data items Y and X
Here, Transaction-A has aquired lock on X and is waiting to acquire lock on y. While, Transaction-B has aquired lock on Y and is waiting to aquire lock on X. But, none of them can execute further.
|Lock (X) (acquired lock on X)
||Lock (Y) (acquired lock on Y)
|Lock (Y) (request lock on Y)
||Lock (X) (request lock on X)
Deadlock Detection and Prevention:
This technique allows deadlock to occur, but then, it detects it and solves it. Here, a database is periodically checked for deadlocks. If a deadlock is detected, one of the transactions, involved in deadlock cycle, is aborted. other transaction continue their execution. An aborted transaction is rolled back and restarted.
Deadlock prevention technique avoids the conditions that lead to deadlocking. It requires that every transaction lock all data items it needs in advance. If any of the items cannot be obtained, none of the items are locked. In other words, a transaction requesting a new lock is aborted if there is the possibility that a deadlock can occur. Thus, a timeout may be used to abort transactions that have been idle for too long. This is a simple but indiscriminate approach. If the transaction is aborted, all the changes made by this transaction are rolled back and all locks obtained by the transaction are released. The transaction is then rescheduled for execution. Deadlock prevention technique is used in two-phase locking.
2. Time-Stamp Methods for Concurrency control :
Timestamp is a unique identifier created by the DBMS to identify the relative starting time of a transaction.
Typically, timestamp values are assigned in the order in which the transactions are submitted to the system. So, a timestamp can be thought of as the transaction start time. Therefore, time stamping is a method of concurrency control in which each transaction is assigned a transaction timestamp. Timestamps must have two properties namely
- Uniqueness : The uniqueness property assures that no equal timestamp values can exist.
- monotonicity : monotonicity assures that timestamp values always increase.
Timestamp are divided into further fields :
- Granule Timestamps
- Timestamp Ordering
- Conflict Resolution in Timestamps
1. Granule Timestamps :
Granule timestamp is a record of the timestamp of the last transaction to access it. Each granule accessed by an active transaction must have a granule timestamp.
A separate record of last Read and Write accesses may be kept. Granule timestamp may cause. Additional Write operations for Read accesses if they are stored with the granules. The problem can be avoided by maintaining granule timestamps as an in-memory table. The table may be of limited size, since conflicts may only occur between current transactions. An entry in a granule timestamp table consists of the granule identifier and the transaction timestamp. The record containing the largest (latest) granule timestamp removed from the table is also maintained. A search for a granule timestamp, using the granule identifier, will either be successful or will use the largest removed timestamp.
2. Timestamp Ordering :
Following are the three basic variants of timestamp-based methods of concurrency control:
- Total timestamp ordering
- Partial timestamp ordering
- Multiversion timestamp ordering
(a) Total timestamp ordering :
The total timestamp ordering algorithm depends on maintaining access to granules in timestamp order by aborting one of the transactions involved in any conflicting access. No distinction is made between Read and Write access, so only a single value is required for each granule timestamp .
(b)Partial timestamp ordering :
In a partial timestamp ordering, only non-permutable actions are ordered to improve upon the total timestamp ordering. In this case, both Read and Write granule timestamps are stored.
The algorithm allows the granule to be read by any transaction younger than the last transaction that updated the granule. A transaction is aborted if it tries to update a granule that has previously been accessed by a younger transaction. The partial timestamp ordering algorithm aborts fewer transactions than the total timestamp ordering algorithm, at the cost of extra storage for granule timestamps
(c) Multiversion Timestamp ordering :
The multiversion timestamp ordering algorithm stores several versions of an updated granule, allowing transactions to see a consistent set of versions for all granules it accesses. So, it reduces the conflicts that result in transaction restarts to those where there is a Write-Write conflict. Each update of a granule creates a new version, with an associated granule timestamp.
A transaction that requires read access to the granule sees the youngest version that is older than the transaction. That is, the version having a timestamp equal to or immediately below the transaction's timestamp.
3. Conflict Resolution in Timestamps :
To deal with conflicts in timestamp algorithms, some transactions involved in conflicts are made to wait and to abort others.
Following are the main strategies of conflict resolution in timestamps:
- The older transaction waits for the younger if the younger has accessed the granule first.
- The younger transaction is aborted (dies) and restarted if it tries to access a granule after an older concurrent transaction.
- The older transaction pre-empts the younger by suspending (wounding) it if the younger transaction tries to access a granule after an older concurrent transaction.
- An older transaction will wait for a younger one to commit if the younger has accessed a granule that both want.
The handling of aborted transactions is an important aspect of conflict resolution algorithm. In the case that the aborted transaction is the one requesting access, the transaction must be restarted with a new (younger) timestamp. It is possible that the transaction can be repeatedly aborted if there are conflicts with other transactions.
An aborted transaction that had prior access to granule where conflict occurred can be restarted with the same timestamp. This will take priority by eliminating the possibility of transaction being continuously locked out.
Drawbacks of Time-stamp
- Each value stored in the database requires two additional timestamp fields, one for the lasttime the field (attribute) was read and one for the last update.
- It increases the memory requirements and the processing overhead of database.
3. Optimistic Methods of Concurrency Control :
The optimistic method of concurrency control is based on the assumption that conflicts of database operations are rare and that it is better to let transactions run to completion and only check for conflicts before they commit.
An optimistic concurrency control method is also known as validation or certification methods. No checking is done while the transaction is executing. The optimistic method does not require locking or timestamping techniques. Instead, a transaction is executed without restrictions until it is committed. In optimistic methods, each transaction moves through the following phases:
- Read phase.
- Validation or certification phase.
- Write phase.
a. Read phase :
In a Read phase, the updates are prepared using private (or local) copies (or versions) of the granule. In this phase, the transaction reads values of committed data from the database, executes the needed computations, and makes the updates to a private copy of the database values. All update operations of the transaction are recorded in a temporary update file, which is not accessed by the remaining transactions.
It is conventional to allocate a timestamp to each transaction at the end of its Read to determine the set of transactions that must be examined by the validation procedure. These set of transactions are those who have finished their Read phases since the start of the transaction being verified
b. Validation or certification phase :
In a validation (or certification) phase, the transaction is validated to assure that the changes made will not affect the integrity and consistency of the database.
If the validation test is positive, the transaction goes to the write phase. If the validation test is negative, the transaction is restarted, and the changes are discarded. Thus, in this phase the list of granules is checked for conflicts. If conflicts are detected in this phase, the transaction is aborted and restarted. The validation algorithm must check that the transaction has :
- Seen all modifications of transactions committed after it starts.
- Not read granules updated by a transaction committed after its start.
c. Write phase :
In a Write phase, the changes are permanently applied to the database and the updated granules are made public. Otherwise, the updates are discarded and the transaction is restarted. This phase is only for the Read-Write transactions and not for Read-only transactions.
Advantages of Optimistic Methods for Concurrency Control :
- This technique is very efficient when conflicts are rare. The occasional conflicts result in the transaction roll back.
- The rollback involves only the local copy of data, the database is not involved and thus there will not be any cascading rollbacks.
Problems of Optimistic Methods for Concurrency Control :
- Conflicts are expensive to deal with, since the conflicting transaction must be rolled back.
- Longer transactions are more likely to have conflicts and may be repeatedly rolled back because of conflicts with short transactions.
Applications of Optimistic Methods for Concurrency Control :
- Only suitable for environments where there are few conflicts and no long transactions.
- Acceptable for mostly Read or Query database systems that require very few update transactions