Key SQL Performance Situations to Watch Out For - Part One of Three
View Part Two - SQL Server Expressions
View Part Three - Full Table Scans
SQL Server is a very deep, robust software package, as are most enterprise class DBMS applications. There are many areas of functionality that work very well and are self tuning. However, with that much functionality at the disposal of the developer community, there are also many ways that features can be overused, misused, or abused. That being stated, there isn’t enough time with a single article to cover every possible conceivable performance situation that could arise. So, this article will attempt to focus on 3 of the most common SQL performance related issues that one will encounter in heavily loaded applications.
SQL Server Blocking
Have you ever parked your car in a garage or parking space, left your car to go to your destination of choice, then returned to your car only to find that someone has double-parked behind your car and boxed your car in with their S-U-ber-Hum-V? Your options at that point are A) call security or police to have the vehicle towed B) wait for the person who double parked behind you and politely ask them to move (while imagining your hands around their neck, shaking them violently) or C) hope your insurance company covers more damage to your vehicle than a Moraceae Ficus plant falling on it would cause and attempt to move the S-U-ber-Hum-V with your sub-compact-mini-3-wheel-go-cart of a car. In all three cases, you are hindered from being able to leave immediately to do what you had intended to do.
If this has never happened to you (you lucky dog), we can look at another example. You go to the supermarket to pick up some milk for your cold cereal breakfasts that you’ve had every morning since you were 3. You get the milk and go straight to the checkout line. You can’t leave because there is only one checkout line and it is the looooooooooongest you’ve seen.
Both of these situations have something in common. There was an activity going on, and it was disrupted in the middle by another activity controlled by circumstances outside of your control, but forcing you to wait for an indefinite amount of time until it is finished and you can resume your activities and finish your task. That is an example of blocking.
In SQL Server, blocking takes the form of one transaction being paused by the query processor because the query in question is trying to access a resource that is already in use. This causes the transaction to wait, for an indefinite amount of time for the resource to free up so that it can continue with its normal activity. Sometimes this shows up as unexplainable “phantom slowness” that comes and goes with no real tell-tale signs as to the cause. Hardware resources look fine (disk, processor, memory, etc.), and yet the application is slow.
Here is an example. Assume there is a Customer table in your database. There is a query that your order system is executing to pull the current customers name and address like so.
SELECT CustName, AddressLine1, City, State, ZipCode
FROM dbo.Customer
WHERE CustomerID = 12345
Sounds good right? You have a good index (clustered… only the best will do for you) on the CustomerID column and the cost is very low. However, when a ba-zillion users are hitting your website, this query is taking forever. What did you do wrong? The query works perfectly in your test environment. So what is going on? Will more hardware fix it?
This situation occurs a lot. Especially with applications under heavy load, or where the same database is used for reporting while moderate to heavy load is placed on the system. In the situation above, let’s add that some manager ran one of your intensive reports of customer order totals right during the peak load of the day. Since the report hits every single customer record and takes a while to process all of the totals and number crunching, all attempts to access the customer table are blocked during its execution.
So what do we do about this? Redesign the whole application? Not necessarily. Some cleverly placed hints or indexes could solve the problem. But how do you know what queries to apply the hints to or where to look?
SQL Server comes with some system tables and functions that can assist you in finding problems with blocking. The first SQL Server object you can use is the sysprocesses table. This system table lists all of the active sessions that are connected to the SQL Server as well as general information about the sessions, such as the client that is connected, what credentials it is using, and what program it is. Also in this table are several columns that can be used to isolate the source of the blocking and what pieces of the system are being blocked.
The first column that we will look at is the spid column. This identifies each session that is connected to the SQL Server. The second column that we will look at is the blocked column. This represents the spid of the session that is currently blocking the current row in the sysprocesses table. A value of 0 in this column represents that no blocking has occurred for the current session. Sometimes you might see a value in the blocked column that has the same value as the spid column, making it appear that the session is blocking itself. This isn’t really the case, and has occurred since SP4 of SQL 2000. When this situation occurs, normally it is because the current transaction is waiting, but not on another transaction. Most of the time when this occurs, the transaction is waiting on a hardware resource that is busy, such as a disk activity or network transfer.
In order to find rows that are being blocked and the “lead” blockers (transactions that are blocking all others), the following queries can be used respectively.
Select A.*
From dbo.sysprocesses as A with (nolock)
Where A.blocked <> 0
And A.blocked <> A.spid
Select A.*
From dbo.sysprocesses as A with (nolock)
Where A.blocked = 0
And A.spid in (Select B.blocked
From dbo.sysprocesses as B with (nolock)
Where B.blocked <> 0
And B.blocked <> B.spid)
This will find the rows corresponding to sessions that are causing blocking. However, how do you fix the blocking problem? How do you identify which query is causing the problem and what would you do with it once you find it?
By using the queries mentioned before, one can locate the sessions involved in the blocking. The following query can then be used to find what query is running and is involved. Simply change the value of “54” to equal the spid value involved with the blocking.
declare @spid int;
declare @sql_handle binary(20);
set @spid = 54
SELECT @sql_handle = sql_handle
FROM sysprocesses As A with (nolock)
where spid = @spid
select text
from ::fn_get_sql(@sql_handle)
It is recommended to create a process that is scheduled periodically (like every 10 seconds or so) that will loop through all of the blocking sessions (both being blocked and causing the blocking) and capture their information and their SQL and store all of it in a database table. The reason this is advantageous is that it would act as sort of a black box flight recorder for the blocking data. It can help you find blocking issues night or day related to your systems.
Now then, to solve blocking problems, one can attack them in a few different ways from different directions. One can try to resolve the problem from the angle of the blocked transaction, or from the blocking transaction or both. One can attack the problem by optimizing the queries to make them run faster by reading less data. This may seem obvious, the less data that is processed allows the query to run faster, but it doesn’t hurt to review the queries involved. If they access less data when they execute, they won’t block other transactions as long because the odds of them using the data pages that another transaction is using is reduced. Another, and more common, resolution is to adjust the locking level of the transaction.
By default, SQL Server locks data that is being read or written to during the duration of a transaction. A transaction could be a single SQL Statement, or if a transaction is explicitly started with a “begin transaction” statement, the data will be locked until the transaction is explicitly closed with a “commit/rollback” statement. This is so that other transactions won’t be able to read data that is in the middle of being written to. This is referred to as transaction isolation. The isolation level setting that is the default for SQL Server is “Read Committed”.
Most queries executed on a SQL Server are SELECT statements and read the data only. There are applications that are exceptions to this, but as a general rule this is so. Each “SELECT” query involved in blocking must be analyzed to determine if bypassing the isolation would impair the application. Most of the time, a simple locking hint can be placed in the query to allow it to do what is referred to as “dirty reads” and avoid this locking mechanism. This allows the SELECT query to read data whether it is involved in another transaction or not. To do this, one would use the NOLOCK query hint.
The following is an example of how to use this hint in a query with multiple tables involved giving you a feel for how the syntax works.
SELECT A.OrderID, A.OrderDate, Sum(B.ItemPrice)
FROM OrderDetail AS C WITH (NOLOCK)
INNER JOIN Product AS B WITH (NOLOCK) ON (C.ProductID = B.ProductID)
INNER JOIN Order AS A WITH (NOLOCK) ON (A.OrderID = B.OrderID)
WHERE OrderID = @OrderNumber;
Notice how the hint comes after the alias, but prior to any joins. Once again, adding this hint will cause the query processor to gather data whether or not it is used by another transaction, thereby eliminating the blocking. Most of the time the “NOLOCK” method is a good way of solving this type of problem.
There are situations, however, when this can cause problems in an application because the application relies on being able to read the latest updated values. For example, consider the following transaction. Note: Having calculated sequence numbers using this method is not recommended, but research shows that this is used in several applications as a common approach, and it illustrates the example needed.
BEGIN TRANSACTION;
DECLARE @intLineNumber int;
SELECT @intLineNumber = MAX(LineNumber) 1
FROM OrderDetail
WHERE OrderID = @OrderNumber;
INSERT INTO OrderDetail
(OrderID, LineNumber)
VALUES (@OrderID, @intLineNumber );
COMMIT;
RETURN @intLineNumber;
--Returned so that the program will know what row was just inserted.
An application that relies on logic similar to this can fail if you allow dirty reads for these types of transactions. In this example, since the @intLineNumber value is used later on to insert rows into the OrderDetail table and this is used to populate a key column (very common practice) then this could create duplicate row errors. This is because the design of this query needs to be isolated to make it work. It requires that only one transaction be able to SELECT the maximum LineNumber column at a time in order to avoid duplicates. These types of transactions cannot be successfully adjusted using NOLOCK hints.
It is recommended that each query involved in blocking be analyzed in this way to determine if a NOLOCK hint is the correct way to fix the blocking without introducing a problem. Report programs hitting transaction databases are a common place to put the NOLOCK hints. Also, “Search Pages” are a common source of blocking, as they commonly pull lots of rows using ad hoc type queries to find the data that the user is searching for prior to allowing them to pull up details about the data or update the data. They are normally a good place to strategically place the NOLOCK hints.
If it is determined that nothing in an entire application requires isolated transactions and could benefit from NOLOCK hints, there is a simpler way of adjusting the application than modifying all of the queries. After connecting to SQL Server, one can execute the following command to change the Isolation Level of the session.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
This setting is essentially the same as turning on NOLOCK hints for every query that is executed by the current session/connection. This can be turned off again and set back to the default by issuing the following command.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
In addition to using this isolation level, SQL Server 2005 has introduced a new feature called “READ_COMMITTED_SNAPSHOT”, which can be turned on as a setting within each database on a SQL Server. This tells the query processor to maintain different versions of the data within the tempdb database so that blocking is prevented, but a transaction can immediately read the “old” data values within the versioning tables while another transaction is updating those values. This makes more work for tempdb and isn’t recommended if you have a high workload already with disks running at or near their I/O capacity. It is mentioned here as an alternative to NOLOCK hints as it can be done for every transaction that hits a particular database, and can be turned on very easily.
As you can see, there are many ways to attack the common transaction blocking problem. Each situation needs to be evaluated appropriately in order to choose the method that best solves the problem without introducing data integrity issues or application problems.
View next article on Performance Situations to Watch Out for SQL Server Expressions.
|