Transaction Concurrency - Optimistic Concurrency Control
The mechanism used to implement concurrency in SQL transaction handling is an issue that affects all database programmers, whether or not they realize it.
The purpose of this article is to present some of the problems caused by database systems that use locking mechanisms and how the solution provided in Mimer SQL avoids all these issues.
In SQL, a transaction represents a logical unit of work; it is an essential concept in database programming. A transaction defines the beginning and end of a series of database operations that are regarded as a single unit. For example, to transfer money between two bank accounts, an amount is subtracted from one account and the same amount is added to the other account. It is essential that either both of these operations succeed or neither does.
From the point of view of other transactions querying the database, a particular transaction has either been successfully committed or it has not. There should be no in-between state where some of the changes are visible; either all the changes are visible or it is as if the transaction is not active.
Concurrency means several transactions being active at the same time. Obviously, there are a number of issues when several transactions are executing at the same time and operating on the same parts of the database. In this paper, we are discussing the mechanisms used to resolve the issue of concurrency.
Concurrency is one of the areas where Mimer SQL has the advantage over other RDBMS products. Mimer SQL uses a method for transaction management called Optimistic Concurrency Control; most other RDBMSs offer pessimistic concurrency control.
Locking Affects Performance
Pessimistic concurrency control protects a user's reads and updates by acquiring locks. Typically, there are read and write locks. Depending on the lock granularity, these locks are held on rows, database pages or even entire tables.
Under pessimistic concurrency control a transaction can obtain a lock only if another transaction does not hold a conflicting lock on the same thing. When a lock is set, other transactions that need to set a conflicting lock are blocked until the lock is released, usually when the transaction is completed. The more transactions that are running concurrently, the greater the probability that transactions will be blocked, leading to reduced throughput and increased response times.
The resource costs involved in lock maintenance are considerable and these costs dramatically escalate as the number and complexity of concurrent transactions increases. However, there is always a large number of incidences where this overhead is totally wasted because the transactions do not actually conflict with one another.
The effects of locking strategies are difficult to predict and the consequences are often only apparent when an application is put into production. Vendors have to provide a vast number of tuning mechanisms to optimize performance which increases the complexity and cost of the development and maintenance of the database system.
The Mimer SQL Solution
Mimer SQL completely eliminates these complicated locking problems and overheads. The Optimistic Concurrency Control method of transaction management used by Mimer SQL does not involve locking and therefore a deadlock can never occur.
Put simply, at the point of transaction commit, the Mimer SQL database server compares the rows that were read during the transaction with the current values in the database. If the rows that were read, and on which the decision to perform updates depended, still have the same values then the transaction can perform the updates.
When a transaction is started under Mimer SQL, a consistent view of the database is frozen based on the state after the last committed transaction. This means that the application will see a totally consistent view of the database during the entire transaction. During the transaction build-up the Mimer system keeps track of objects read during the transaction and builds an intention list of changes to be made, but does not actually apply any of these changes to the database.
The prepare phase starts when the transaction is committed by the application. It involves using the objects read during the transaction to detect conflicts with other transactions. A conflict occurs when another transaction alters data in a way that would modify any of the objects read in the transaction that is being checked. Other transactions that were committed during the checked transaction's build-up phase or during this prepare phase can cause a conflict.
If no conflicts are detected then the commit phase secures the operations in the intention list to disk. Part of this phase is the atomic operation that makes visible all the changes involved in the transaction, even when there are a large number of operations involved. Although background threads update the database, the committed transaction changes are immediately visible to all applications.
If a transaction conflict is detected, the transaction being checked is aborted. Because the intention list contains the changes, no actual modifications have been made to the database and therefore a physical rollback of the database is not required. In this case, the application controls what action is performed. There are times when the transaction can safely be repeated until the commit is successful, other times the new data is re-displayed to the user, prompting them whether to go ahead.
In a pessimistic concurrency control system, it is possible to get into a situation where a number of transactions are waiting on each other, a situation known as a deadlock, where none of the transactions can proceed. When a deadlock occurs the only solution is for one of the blocked transactions to be aborted.
The technique used to determine whether a deadlock has occurred varies but a common method is timeout-based, that is, after a fixed period of time automatically abort any transactions that have been blocked for longer than this length of time (for example, 20 seconds). This implies that a number of users can be waiting for this period of time. Further, there is the possibility that a transaction is victimized even though it is not deadlocked.
Another common method is graph-based detection where blocked situations are tracked and checked for deadlocks. Scalability is an issue with this method; the costs involved in traversing the locking graph rapidly escalate as the number of concurrent transactions increases.
With Mimer SQL, a deadlock can never occur. Through the use of optimistic concurrency control, Mimer SQL is able to avoid problems such as data being left inaccessible because of locks being held over user interactions or client processes failing. It is the long-standing transaction (for example, the user who has gone to lunch) that is penalized under optimistic concurrency control. This is the opposite of the locking situation where everyone else is penalized.
A solution to lock thrashing offered by the vendors to their customers, is to reduce the number of concurrent transactions in the system by restricting the number of server sessions, that is, throttle the throughput of the database server. Another solution programmers have been known to implement to solve the problem of conflicting locks is to reduce the size of the transaction by splitting it into a number of shorter transactions. This may reduce the time that locks are held, but it destroys the all-or-nothing nature of the original transaction, introducing the possibility that a software or hardware failure will cause an inconsistent database.
Programming with Mimer SQL is simplified as transaction aborts only occur at the commit, whereas, in other DBMSs, deadlocks can occur at any point during a transaction. Also, with Mimer SQL, the problems of introducing new applications into a production environment are minimized. In a locking system, a new application can introduce deadlocks and the performance of existing applications may suffer due to blocking.
It should be obvious that, in a locking system, the order of operations in a transaction can affect other concurrent transactions, possibly resulting in a deadlock. However, what is not so obvious is that the SQL optimizer decides the evaluation order within a statement. This may introduce deadlocks in a production system when the optimizer changes the access order. Programmers can code around this by splitting the SQL statement, which may avoid the problem, but this practice no longer allows the SQL optimizer to calculate the optimal join order.
In the case of Mimer SQL, the order that rows are accessed within the transaction is irrelevant to the transaction handling. That is, if the Mimer SQL optimizer changes the join order, it does not have any impact on this transaction or any other concurrent transactions.
When database vendors talk about a data warehouse, this is their solution to the lock conflicts between long-running queries required for decision support and the relatively short operations of a transaction processing system. By creating a copy of the transaction-processing database and calling it a data warehouse, they are merely removing the problem of lock contention between queries and updates by running against different databases.
Mimer SQL supports the concept of Read-Only transactions. When using a read-only transaction, the application will always see a consistent view of the database throughout the transaction, allowing precise information to be provided for decision support. As the consistency of the database is guaranteed, an actual transaction check is not performed and the internal structures used to perform the transaction checks are not required.
With Mimer SQL, long-running queries can execute in parallel with on-line transaction processing. This subject of transaction isolation (repeatable reads) is another area where Mimer SQL offers advantages over its competitors.
Last updated: 2002-05-18