Example: A transaction that transfers Rs 700 from account A to account B.
To ensure the integrity of data, a database system is required to maintain the following properties: Atomicity, Consistency, Isolation, and Durability (the ACID properties).
Examples:
Atomicity: Consider a funds transfer transaction where money is deducted from one account and added to another. Atomicity ensures that either both these operations happen successfully or none at all, preventing scenarios where money is deducted but not added or vice versa.
Consistency: In a database managing student grades, consistency ensures that if a transaction involves updating grades, the database moves from one consistent state (e.g., all grades are within a valid range) to another consistent state after the transaction is executed.
Isolation: Imagine two transactions updating different sets of records simultaneously. Isolation ensures that the changes made by one transaction do not affect the ongoing operations or the final outcome of the other transaction.
Durability: After a user confirms a purchase in an e-commerce system, durability ensures that the record of that transaction persists in the database even if there's a system failure immediately after the confirmation, safeguarding against data loss.
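As an illustration of atomicity and durability, here is a minimal Python sketch using the standard sqlite3 module; the accounts table, the account names, and the balances are made up for this sketch. The Rs 700 transfer either commits as a whole or is rolled back, so money is never deducted without also being credited.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 1000), ("B", 500)])
conn.commit()

def transfer(conn, src, dst, amount):
    # Both UPDATEs run inside one transaction: either both take effect or neither does.
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        # A crash or error here leaves the database unchanged, because nothing is committed yet.
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
        conn.commit()       # durability: once commit returns, the transfer survives a failure
    except Exception:
        conn.rollback()     # atomicity: undo the partial update
        raise

transfer(conn, "A", "B", 700)
print(conn.execute("SELECT name, balance FROM accounts ORDER BY name").fetchall())
# [('A', 300), ('B', 1200)]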
Concurrency refers to the simultaneous execution of multiple transactions.
Example:
Time | T1   | T2   | T3
-----+------+------+-----
t1   |      | R    |
t2   | R    |      |
t3   |      |      | W
t4   |      | R    |
t5   |      | R    |
t6   |      | W    |
t7   | W    |      |
t8   | R    |      |
t9   |      |      | R
Example:
Consider two transactions, T1 and T2, that both read the value of a variable 'A' and then update it. T2 writes its new value without taking T1's update into account, so T1's update is overwritten by T2's write. This is the lost update problem.
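The interleaving can be written out as a small Python sketch (the balance and the amounts are made-up values): both transactions read the same old value of A, so T2's write silently discards T1's update.
A = 1000              # shared data item, e.g. an account balance

t1_local = A          # T1: R(A)  -> 1000
t2_local = A          # T2: R(A)  -> 1000, T1's change is not visible yet
A = t1_local + 100    # T1: W(A)  -> 1100
A = t2_local - 50     # T2: W(A)  ->  950, overwriting T1's write

print(A)              # 950, although a serial execution would give 1050 (1000 + 100 - 50)
Locking A for the whole read-modify-write, or running the transactions serially, prevents this anomaly.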
Example:
Consider two transactions, T1 and T2, where T1 updates a variable 'X', but before committing the update, T2 reads the intermediate and uncommitted value of 'X'. This premature read by T2 may result in inaccurate or inconsistent data, illustrating the dirty read problem.
Example:
Imagine two transactions, T1 and T2, where T1 updates a variable 'V', and T2 reads or writes 'V' based on the update from T1. If T1 is rolled back due to a failure before committing, it creates an uncommitted dependency problem for T2. This is because T2 might have performed operations based on the temporary changes made by T1, which are now lost due to the rollback.
Example:
Consider two transactions, T1 and T2, where T1 reads a data item twice. Meanwhile, T2 updates the same data item between the two read operations of T1. This creates an unrepeatable read problem for T1 because the second read yields a different value than the first read due to the update performed by T2.
Example:
Suppose transaction T1 reads several data items to perform an analysis, for example summing a set of account balances, while transaction T2 updates some of those items partway through T1's reads. T1's result is then based on a mixture of old and new values and reflects no consistent state of the database; this is the inconsistent analysis problem.
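A sketch of the same problem with concrete numbers (the two balances and the transferred amount are made up): T1 sums two balances while T2 moves money between them, so T1's total matches no state the database was ever in.
accounts = {"X": 500, "Y": 500}      # the true total is always 1000

total = accounts["X"]                # T1 reads X -> 500

# T2 runs in between: transfer 200 from Y to X and commit.
accounts["Y"] -= 200
accounts["X"] += 200

total += accounts["Y"]               # T1 reads Y -> 300

print(total)                         # 800: an inconsistent analysis, although the real total never changed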
Example:
Consider two transactions, T1 and T2. T1 reads a set of variables, and in between its two read operations, T2 inserts or modifies a record that affects the range of data read by T1. When T1 reads the same set of variables again, it encounters a phantom read problem as the data it initially read has changed due to the insertion or modification by T2.
Example:
T1        | T2
----------+------------
R(A)      |
R(B)      |
A = A + B |
W(A)      |
          | R(A)
          | A = A - 500
          | W(A)
In the above example, all of T1's operations (reading A and B, computing A = A + B, and writing A) complete before T2 begins; T2 then reads the updated value of A, subtracts 500, and writes it back. The operations of the two transactions do not interleave.
Example:
Consider two transactions, T1 and T2. In a serial schedule where T2 executes first, T2 is completely executed before T1 begins: all of T2's operations, including reads, writes, and any computations, finish before T1's operations commence.
Serial schedules ensure a strict sequential execution of transactions, eliminating any interleaving of operations between transactions. This guarantees that the effects of one transaction are visible to the database before the next transaction begins, maintaining a clear order of execution.
While serial schedules provide simplicity and ease of analysis, they may lead to reduced concurrency and potentially longer execution times when compared to more concurrent scheduling methods.
Example:
Let's consider a non-serial schedule involving two transactions, T1 and T2, in which the instructions of T1 and T2 are interleaved so that the transactions execute concurrently. To determine whether it is serializable, we compare its result with that of a corresponding serial schedule.
Non-Serial Schedule:
T1: R(A)
T2: R(B)
T1: A = A + B
T2: B = B * 2
T1: W(A)
T2: W(B)
To check serializability, we compare this schedule with the serial schedule:
Serial Schedule:
T1: R(A)
T1: A = A + B
T1: W(A)
T2: R(B)
T2: B = B * 2
T2: W(B)
If the results of both schedules are the same, the non-serial schedule is serializable. If not, it indicates potential conflicts or dependencies that may need resolution for serializability.
Serializable schedules are crucial for maintaining consistency in a concurrent database environment, ensuring that the final state is equivalent to the state achieved through a serialized execution of transactions.
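This comparison can be done mechanically. The sketch below (initial values of A and B chosen arbitrarily) runs the interleaved schedule and the serial schedule T1 followed by T2 on the same starting state; identical results confirm that the non-serial schedule above is serializable.
def run_non_serial(A, B):
    t1_a = A            # T1: R(A)
    t2_b = B            # T2: R(B)
    t1_a = t1_a + B     # T1: A = A + B (B is still the original value)
    t2_b = t2_b * 2     # T2: B = B * 2
    A = t1_a            # T1: W(A)
    B = t2_b            # T2: W(B)
    return A, B

def run_serial(A, B):
    A = A + B           # T1: R(A), A = A + B, W(A)
    B = B * 2           # T2: R(B), B = B * 2, W(B)
    return A, B

print(run_non_serial(100, 20))   # (120, 40)
print(run_serial(100, 20))       # (120, 40) -> same result, so this interleaving is serializable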
A precedence graph is employed to test the serializability of a non-serial schedule.
An edge from T1 to T2 is drawn when an operation of T1 conflicts with a later operation of T2 on the same data item Q, that is, when at least one of the two operations is a write:
T1   | T2   | Edge
-----+------+-----------
W(Q) | R(Q) | T1 -> T2
R(Q) | W(Q) | T1 -> T2
W(Q) | W(Q) | T1 -> T2
R(Q) | R(Q) | no edge
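The edge rules above can be turned into a small Python sketch. A schedule is represented here as a list of (transaction, operation, item) tuples, which is an assumption of this sketch rather than notation from the notes; an edge Ti -> Tj is added whenever an operation of Ti conflicts with a later operation of Tj on the same item.
def precedence_graph(schedule):
    # schedule: list of (txn, op, item) tuples in execution order, op is 'R' or 'W'
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # conflict: different transactions, same item, at least one write
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges

def is_conflict_serializable(schedule):
    # The schedule is conflict serializable exactly when the precedence graph has no cycle.
    edges = precedence_graph(schedule)
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def has_cycle(node):
        visiting.add(node)
        for nxt in graph[node]:
            if nxt in visiting or (nxt not in done and has_cycle(nxt)):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return not any(node not in done and has_cycle(node) for node in graph)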
Example:
T1   | T2   | T3
-----+------+-----
R(A) |      |
R(C) |      |
W(A) |      |
     | R(B) |
W(C) |      |
     | R(A) |
     |      | R(C)
     | W(B) |
     |      | R(B)
     |      | W(C)
     | W(A) |
     |      | W(B)
The precedence graph for this schedule has the edges T1 -> T2 (conflicts on A), T1 -> T3 (conflicts on C) and T2 -> T3 (conflicts on B). Since no cycle is formed, the schedule is conflict serializable, equivalent to the serial order T1, T2, T3.
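Applying the sketch above to this schedule confirms the conclusion: the only edges are T1 -> T2, T1 -> T3 and T2 -> T3, so the precedence graph is acyclic.
schedule1 = [
    ("T1", "R", "A"), ("T1", "R", "C"), ("T1", "W", "A"),
    ("T2", "R", "B"), ("T1", "W", "C"), ("T2", "R", "A"),
    ("T3", "R", "C"), ("T2", "W", "B"), ("T3", "R", "B"),
    ("T3", "W", "C"), ("T2", "W", "A"), ("T3", "W", "B"),
]
print(sorted(precedence_graph(schedule1)))  # [('T1', 'T2'), ('T1', 'T3'), ('T2', 'T3')]
print(is_conflict_serializable(schedule1))  # True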
Another example:
T1   | T2   | T3
-----+------+-----
R(A) |      |
     | R(B) |
     |      | R(C)
     | W(B) |
     |      | W(C)
W(A) |      |
     | R(A) |
R(C) |      |
     | W(A) |
W(C) |      |
     |      | W(B)
Here the precedence graph has the edges T1 -> T2 (conflicts on A), T2 -> T3 (conflicts on B) and T3 -> T1 (conflicts on C), which form the cycle T1 -> T2 -> T3 -> T1. Since the graph is cyclic, this schedule is not conflict serializable.
If we have a non-serial schedule and want to check whether it is serializable, or to rearrange it into an equivalent serial form, two methods can be employed: conflict serializability and view serializability. Both rely on the notion of conflicting operations: two operations conflict if they belong to different transactions, access the same data item, and at least one of them is a write. For a data item A, the conflicting pairs are:
T1   | T2
-----+------
W(A) | W(A)
W(A) | R(A)
R(A) | W(A)
Example:
T1 | T2
---------------
R(A) |
W(A) |
| R(A)
| W(A)
R(B) |
W(B) |
| R(B)
| W(B)
To check whether the given schedule is conflict serializable, we identify the conflicting operations and rearrange only the non-conflicting ones. Here the conflicting pairs are W(A) in T1 with R(A) and W(A) in T2, and W(B) in T1 with R(B) and W(B) in T2; operations on different data items do not conflict. Swapping only adjacent non-conflicting operations of different transactions (the order of operations within each transaction is preserved), T1's operations on B can be moved ahead of T2's operations on A, giving:
T1   | T2
-----+------
R(A) |
W(A) |
R(B) |
W(B) |
     | R(A)
     | W(A)
     | R(B)
     | W(B)
This is exactly the serial schedule T1 followed by T2, reached purely by swapping non-conflicting operations, so the given schedule is conflict serializable and its final result is the same as that of a serial execution of the transactions.
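The swapping argument can also be sketched in code. The function below (its name, the schedule encoding, and the bubble-sort approach are choices made for this sketch) tries to reorder the schedule into the serial order T1, T2 using only swaps of adjacent operations from different transactions that do not conflict; succeeding means the schedule is conflict equivalent to that serial schedule.
def conflicts(op1, op2):
    # Operations conflict if they come from different transactions,
    # touch the same item, and at least one of them is a write.
    (t1, o1, x1), (t2, o2, x2) = op1, op2
    return t1 != t2 and x1 == x2 and "W" in (o1, o2)

def swap_to_serial(schedule, serial_order):
    # Bubble the schedule towards the given serial order using only legal swaps.
    rank = {txn: i for i, txn in enumerate(serial_order)}
    ops = list(schedule)
    changed = True
    while changed:
        changed = False
        for i in range(len(ops) - 1):
            if rank[ops[i][0]] > rank[ops[i + 1][0]]:    # out of order for the target serial order
                if conflicts(ops[i], ops[i + 1]):
                    return None                          # a conflicting pair cannot be swapped
                ops[i], ops[i + 1] = ops[i + 1], ops[i]
                changed = True
    return ops

given = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
         ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
print(swap_to_serial(given, ["T1", "T2"]))
# All of T1's operations end up before T2's, so the schedule is conflict serializable.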
Two schedules, S1 and S2, are view equivalent if they satisfy the following conditions:
1. Initial read: if a transaction reads the initial value of a data item in S1, it must also read the initial value of that item in S2.
2. Updated read: if a transaction reads a value of a data item written by transaction Tj in S1, it must read the value written by Tj in S2 as well.
3. Final write: the transaction that performs the final write on a data item in S1 must also perform the final write on that item in S2.
S1: T1   | T2            S2: T1   | T2
    -----+------             -----+------
    R(A) |                        | W(A)
         | W(A)              R(A) |
In S1, T1's R(A) reads the initial value of A. In S2, the R(A) occurs after T2's W(A), so T1 no longer reads the initial value. The initial-read condition is violated, and the two schedules are not view equivalent.
S1: T1   | T2   | T3        S2: T1   | T2   | T3
    -----+------+-----          -----+------+-----
    W(A) |      |                    | W(A) |
         | W(A) |               W(A) |      |
         |      | R(A)               |      | R(A)
In S1, T3 reads the value of A written by T2 and T2 performs the final write on A; in S2, T3 reads the value written by T1 and T1 performs the final write. The updated-read and final-write conditions are violated, so S1 and S2 are not view equivalent.
S1: T1   | T2   | T3        S2: T1   | T2   | T3
    -----+------+-----          -----+------+-----
    W(A) |      |                    | R(A) |
         | R(A) |               W(A) |      |
         |      | W(A)               |      | W(A)
In S1, T2 reads the value of A written by T1; in S2, T2's R(A) occurs before T1's W(A), so it reads the initial value instead. The updated-read condition is violated, so S1 and S2 are not view equivalent.
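These conditions can be checked programmatically. The sketch below (the schedule encoding and the helper names are assumptions of this sketch) computes, for each schedule, which transactions read the initial value of each item, which write each read observes, and which transaction writes each item last, and then compares the three sets.
def view_info(schedule):
    # schedule: list of (txn, op, item) tuples in execution order, op is 'R' or 'W'
    last_writer = {}       # item -> transaction that wrote it most recently
    initial_reads = set()  # (txn, item): txn read the initial value of item
    reads_from = set()     # (reader, writer, item): reader read the value writer wrote
    for txn, op, item in schedule:
        if op == "R":
            if item in last_writer:
                reads_from.add((txn, last_writer[item], item))
            else:
                initial_reads.add((txn, item))
        else:
            last_writer[item] = txn
    final_writes = {(txn, item) for item, txn in last_writer.items()}
    return initial_reads, reads_from, final_writes

def view_equivalent(s1, s2):
    return view_info(s1) == view_info(s2)

# The first pair of schedules above: S1 = R1(A), W2(A); S2 = W2(A), R1(A).
s1 = [("T1", "R", "A"), ("T2", "W", "A")]
s2 = [("T2", "W", "A"), ("T1", "R", "A")]
print(view_equivalent(s1, s2))   # False: T1 reads the initial value of A only in S1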
Example:
Consider the following schedule involving two transactions, T1 and T2:
T1   | T2
-----+------
R(A) |
W(A) |
     | R(A)
     | W(A)   (A was updated by T1)
W(B) |
In this example, T2 reads and writes A after T1 has written it, so T2 depends on T1's uncommitted update. The schedule is recoverable provided T2 commits only after T1 commits: if T1 fails and is rolled back before committing, T2 can also be rolled back, and the database returns to a consistent state. If T2 were allowed to commit before T1, a later failure of T1 could not be undone cleanly, and the schedule would be unrecoverable.
Example:
Consider the following schedule involving two transactions, T1 and T2, where T2 reads a data item before T1 commits:
T1   | T2
-----+------
W(A) |
R(A) |
R(B) |
     | W(B)
     | R(A)   (reads the uncommitted value written by T1)
In this example, T2 reads data item A before T1 commits its write on A. If T1 then fails and rolls back, the value that T2 read becomes invalid, so T2 must also be rolled back; the rollback of one transaction forces the rollback of another that depended on it. This situation is a cascading rollback.
To achieve a cascadeless schedule, transactions should be arranged such that a transaction commits before the data it updates is read by another transaction, preventing cascading rollbacks.
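Both properties can be checked from the schedule if commits are recorded. In the sketch below a commit is written as the operation 'C' (the encoding and function names are assumptions of this sketch): recoverability requires every reader to commit after the transaction whose write it read, and cascadelessness requires every such read to happen only after that writer has committed.
def is_recoverable(schedule):
    # Assumes every transaction commits within the schedule ('C' marks the commit).
    last_writer, commit_pos, reads_from = {}, {}, set()
    for pos, (txn, op, item) in enumerate(schedule):
        if op == "W":
            last_writer[item] = txn
        elif op == "R" and last_writer.get(item) not in (None, txn):
            reads_from.add((txn, last_writer[item]))
        elif op == "C":
            commit_pos[txn] = pos
    return all(commit_pos[writer] < commit_pos[reader] for reader, writer in reads_from)

def is_cascadeless(schedule):
    last_writer, committed = {}, set()
    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer not in (None, txn) and writer not in committed:
                return False        # dirty read: an abort of the writer would cascade
        elif op == "C":
            committed.add(txn)
    return True

# T1 writes A, T2 reads it, then T1 commits before T2: recoverable, but not cascadeless.
s = [("T1", "W", "A"), ("T2", "R", "A"), ("T1", "C", None), ("T2", "C", None)]
print(is_recoverable(s), is_cascadeless(s))   # True False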
Granularity of Locks
Locking can occur at the following levels:
- Database level: the entire database is locked.
- Table (file) level: an entire table is locked.
- Page (block) level: a single disk page or block is locked.
- Record (tuple) level: a single row is locked.
- Field (attribute) level: a single field value within a row is locked.
Coarser granularity means fewer locks to manage but less concurrency; finer granularity allows more concurrency at the cost of greater locking overhead. This topic falls under the umbrella of concurrency control.
Necessary Conditions for Deadlock:
1. Mutual exclusion: at least one resource (such as a lock on a data item) can be held by only one transaction at a time.
2. Hold and wait: a transaction holding at least one resource is waiting to acquire additional resources held by other transactions.
3. No preemption: a resource cannot be forcibly taken away from the transaction holding it; it must be released voluntarily.
4. Circular wait: there exists a set of transactions in which each one is waiting for a resource held by the next, forming a cycle.
When all four conditions hold simultaneously, the system is in a deadlock.
Deadlocks, once identified, can be managed using different methods to prevent or address them. There are two primary methods for handling deadlocks:
Preventing deadlocks involves employing strategies and protocols that eliminate one or more of the necessary conditions for deadlock, so that the system is structured in a way that deadlocks cannot occur. Key strategies include:
- Requiring a transaction to acquire all the locks it needs before it begins, eliminating hold and wait.
- Imposing a fixed ordering on data items and requesting locks only in that order, eliminating circular wait.
- Timestamp-based schemes such as wait-die and wound-wait, which use transaction timestamps to decide whether a requesting transaction waits or is rolled back, so that a cycle of waiting transactions can never form.
Deadlock detection involves periodically examining the system's state to identify whether a deadlock has occurred. If a deadlock is detected, recovery strategies are implemented to resolve it. Common approaches include:
- Maintaining a wait-for graph, with a node for each transaction and an edge Ti -> Tj whenever Ti is waiting for a lock held by Tj; a cycle in this graph indicates a deadlock (see the sketch below).
- Recovering by choosing a victim transaction from the cycle, rolling it back to break the deadlock, and restarting it later.
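A minimal sketch of wait-for-graph detection (the graph contents are illustrative): each transaction maps to the set of transactions it is waiting on, and a depth-first search looks for a cycle.
def find_deadlock(waits_for):
    # waits_for: dict mapping each transaction to the set of transactions it waits on.
    # Returns the transactions forming a cycle, or None if there is no deadlock.
    def visit(txn, path, on_path):
        on_path.add(txn)
        path.append(txn)
        for other in waits_for.get(txn, ()):
            if other in on_path:
                return path[path.index(other):]          # cycle found
            cycle = visit(other, path, on_path)
            if cycle:
                return cycle
        on_path.discard(txn)
        path.pop()
        return None

    for txn in waits_for:
        cycle = visit(txn, [], set())
        if cycle:
            return cycle
    return None

# T1 waits for T2, T2 waits for T3, and T3 waits for T1: a deadlock.
print(find_deadlock({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}))   # ['T1', 'T2', 'T3']
Once a cycle is found, one of the transactions in it is chosen as the victim and rolled back, as described above.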
These methods provide different approaches for managing deadlocks, offering a balance between prevention measures and post-detection recovery strategies.