When comparing RDBMS features one of the more discussed features is how the database server handles concurrency; ie how do we make sure that data is handled in such a way that we get high performant throughput but also so users can not view uncommitted changes and so on.

In RDBMS systems there are mainly to different ways of solving this issue:

  • Data locking
  • Versioning

In this article, we will look at these two ways to handle concurrency issues and also see what new features SQL Server 2005 introduces.

Data locking

Locking of data has been SQL Server's preferred way of handling concurrency. While, for example, an update statement executes against some data, that data is being locked and no one else can read it (based upon isolation levels). Then, dependent upon how much data that is affected, SQL Server can decide to use a row-lock, page lock or table lock.

Above, I mentioned isolation levels. The isolation level determines how long a read lock is being held (in SQL Server a write lock is always being held to the end of the transaction). The default isolation level in SQL Server is Read Committed: a read operation can only read committed data. If data is being updated while you read, that data is being locked and you won't be able to view the data until the transaction that updates the data has committed (or rolled back).

From a concurrency standpoint this is very good, you are ensured that you can only read correct data. From a throughput perspective it is not that good because your read operation won't return until the write locks has been released.

In order to enhance the throughput other database vendors are using versioning instead of locking.

Data Versioning

In a system using versioning, data is not being locked but instead read operations happen against an older version of the data being manipulated. It is important to notice that the read operation happens against an earlier committed version of the data, we are not reading uncommitted data. Dependent upon isolation levels, the read operation then either read the latest committed data or the data as it was when the read operation started.

Using versioning seems like the ideal solution; we can always read data (no locking), and we are always reading committed data! Yes, that is absolutely true. However, versioning doesn't come without a cost. First of all, when an update operation takes place against the data, the data being touched is copied to a separate storage area. This incurs a performance penalty, but also be aware of the impact of volume of data being copied; ie, do not use versioning if you are doing large batch updates. Read operations will also incur a performance penalty as you potentially have to read through different versions of the data.

Having looked at the principles behind locking and versioning, let's take a look at what is new in SQL Server 2005.

SQL Server 2005 and Versioning

For developers used to database servers using versioning, SQL Server's locking may seem like a step back. In SQL Server 2005 therefore, Microsoft has included the ability to use versioning in addition to locking!

Versioning in SQL Server 2005 is implemented as two new isolation levels:

  • Read Committed Snapshot
  • Transaction Isolation Level Snapshot

The difference between these two isolation levels is how reads are handled inside the same transaction. For example, if you have a piece of T-SQL code looking like so:

use pubs
go
BEGIN TRAN

select * from authors where au_id = '111-111-1111';
--do some other work --now select from authors again select * from authors where au_id = '111-111-1111';

COMMIT TRAN

...and in between the first and second read from authors, the record we're reading have been updated, the behavior would be different. For read committed snapshot you would see the last committed value, whereas for transaction isolation level snapshot the value would be as it was for the first read.

Having seen an example of this, the question is how do we enable these new isolation levels (they are no on by default)? You are enabling them on a database by database level, and you do it through the ALTER DATABASE syntax.

Read committed snapshot is enabled like so:
use master;
go
ALTER DATABASE pubs
SET READ_COMMITTED_SNAPSHOT ON

Enabling read commited snapshot replaces the default read committed isolation level with read committed snapshot, so in order to use it you do not have to do anything in particular after having enabled it.

Transaction isolation level snapshot is enabled like this:

use master;
go
ALTER DATABASE pubs
SET ALLOW_ISOLATION_LEVEL_SNAPSHOT ON

As opposed to read committed snapshot transaction isolation level snapshot has to be explicitly enabled for the session or statement where you want to use it. The following code snippet shows how to enable it:

use pubs
go
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
BEGIN TRAN

select * from authors where au_id = '111-111-1111';
--do some other work --now select from authors again select * from authors where au_id = '111-111-1111';

COMMIT TRAN

Summary

Versioning in SQL Server 2005 gives the application developer new means to create applications with great throughput for read operations. It also gives SQL Server the same capabilities that competing database systems have had for quite a while.

Be aware, however, that versioning doesn't come without a cost. When enabling versioning in a database and, for example, updating records, the whole record will be copied to TempDb (which is where the version store is) plus an additional 14 bytes. In addition to increased size of TempDb read operations will also be slower, as it has to read against TempDb and potentially through quite a few versions unti it finds the version of the record it wants.