Transactions and Isolation levels

Safety googlesAt work, we have two applications which connect to the same database. For all kinds of business reasons, we need to make sure that only one of the applications accesses certain data at the same time. To do this, we use a row in a table as a semaphore.

While working on the locking mechanism, we had a closer look at the Transaction Management and the Isolation Levels we were using. There is a lot of good documentation on Transaction and Isolation, but it tends to be over complete, elaborate and therefore hard to read. I’ll try to share our insights with you in a slightly more digestable form (I hope).

What is a Transaction
Every (proper) database uses Transactions. A Transaction is a unit of work which you can undo (roll back) or save (commit) to the database. Generally, database clients tend to “Auto Commit” which effectively means that each of your actions on the database are executed immediately and can not be rolled back.

When you click through your client’s settings, you can probably find an option to turn Auto Commit off. Now, at your first database write action (insert or update), a Transaction is started. The database will keep track of each of your write actions, and once you’re finished, you can decide to keep the changes by “committing” the Transaction. You can also decide to revert all the changes you made and do a “rollback”, as long as the Transaction is not closed. Both Commit and Rollback close a transaction, so choose wisely.

What is an Isolation Level
Most databases allow more than one user to connect to the database. In order for you to do your work, you may require a stable database without changes. To protect yourself from seeing other peoples updates, you can set an “Isolation Level”. Generally, you set the isolation level at the time of starting a Transaction.

Isolation levels come in many different flavours, from “liberal” to “strict”:

  • READ UNCOMMITTED – Allows you to see the database as it is right now. Doing he same query on a table twice is likely to produce different results as other people are updating the table, even if they haven’t committed their transactions yet.
  • READ COMMITTED – The database will only show you changes to the table which are commited. Doing the same query twice can still produce different results, but you’re sure that those changes will not be olled back by the authors.
  • REPEATABLE READ – The database will make sure that when you execute the same query twice, you will see the same data twice. It may have to lock the rows in the table to do this, which will prevent other people from updating them until you complete your transaction.
  • SERIALIZABLE – The database will not only make sure you see the same data, but it will also make sure that all updates in the system are done in such a way that it seems as if there was only 1 user connected to the database at a time. This is the most “expensive” isolation level, and tends to be relatively slow.

Transaction is not Isolation.
Although a Transaction may suggest that your changes are not written to the database until after your commit, it will write all changes directly to the tables. That means that anybody can see what you are doing to the database, depending on their Isolation levels. Mind you: their isolation levels, not yours.

You can see a Transaction as your “undo buffer” if you will, and the “Isolation Level” as the filter through which you are looking at other people’s actions to the database.

How about our semaphores?
Back to our semaphore problem. Reading about Transactions and Isolation Levels, we knew that both systems must do the “fetch” (a read and an update) of the semaphore within a single transaction, and both systems must use the “serializable” isolation level to make sure that it reads the correct, current value without having a chance of incorrectly fetching the smaphore.

What’s funny about the serializable isolation level however, is that it does not prevent you from overwriting other people’s values without knowing. Our example:

  1. Application 1 reads the semaphore to be 0 and decides it can set it to 1.
  2. Application 2 reads the semaphore to be 0 and also decides it can set it to 1.
  3. Application 1 sets the semaphore to 1 and thinks it has the lock.
  4. Application 2 also sets the semaphore to 1 and thinks is has the lock.

This is clearly a problem. We learned that:

Regardless of your isolation level or transaction usage, the last commit always wins.

There is, however one database on the market which at least fails in this situation. It’s Oracle. When you use the “Serializable” isolation level, Oracle will prevent you from overwriting a row which was changed without your knowing about it. This changes our scenario as follows:

  1. Application 1 reads the semaphore to be 0 and decides it can set it to 1.
  2. Application 2 reads the semaphore to be 0 and also decides it can set it to 1.
  3. Application 1 sets the semaphore to 1 and thinks it has the lock.
  4. Application 2 tries to set the semaphore to 1, but Oracle notices that the row has changed since the last read, and will fail to do the update. An error (ORA-08177) is thrown at the application, allowing you to re-read the lock and discover it has been taken.

Generally, you will have 1 application connecting to 1 database. The application will use the same Transaction granularity and Isolation levels throughout, and you will not notive these interesting behaviours. As soon as you have multiple applications with different isolation levels connecting to the same databases, you are likely to run into interesting, sporadic irreproducable problems. Try to think about your Isolation strategies.

If I got you interested, I recommend reading this blogpost by Peter Veentjer about this very same thing. Peter is a bit more technically correct than I am in this post.


2 Responses to Transactions and Isolation levels

  1. pveentjer says:

    Hi Rolfje,

    what you could do is a select for update. The select for update automatically places a pessimistic locking on the records read and therefore you don’t need to deal with the optimistic locking failure 08177.

    It is not just Oracle that is ‘suffering’ from this problem. Other MVCC databases like Postgresql, MySQL + InnoDB also have the same issues.

    If you really want to know how Oracle and concurrency control works, I recommend the excellent books of Thomas Kyte. The general rule is: each database is different, and to make effective use of it you need to understand the concurrency control mechanisms of each db.

    Thanks for refering to my post 🙂

    Oracle only supports 2 + 1 isolation level:
    – committed read
    – serialized. This isolation level is not completely serialized btw:
    – read only: that provides real serialised semantics but can’t be used for updates.

  2. rolfje says:

    Hi Peter,

    Yes you are right, and we did indeed go for a select for update in the end. But even with a select for update, some databases still allow the second write action to the row fail, thereby overwriting a value which you didn’t see in the select.

    This story is also a nice example of why you can not just switch databases in an existing application “because you have [framework X] which abstracts the database”. 🙂

    Thanks for calrifying and referring to the “select for update” which made our lives a lot easier in this case.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s