Analysing wait type, latch and spinlock statistics

The previous blog post outlined an approach for collecting wait type, latch and spinlock statistics. This post shows how the collected statistics can be analysed and graphed.

SQL Server retains cumulative figures for waiting task counts and wait times. These figures are reset to zero when SQL Server is restarted or when the DBCC SQLPERF(‘sys.dm_os_xxx_stats’,CLEAR) command is run. These raw cumulative figures can be easily graphed. You’ll see steady increases in most of these as well as sharp increases when resource issues arise. As well as cumulative figures we’d also like to see absolute values for waiting tasks and waiting times. These can be obtained by calculating the delta of the cumulative figures for consecutive sampling times. Using the factWaitStats table from the previous post as an example we could using the SQL Server 2012 LAG function to obtain the deltas:

waiting_tasks_count - LAG(waiting_tasks_count,1,0) OVER (PARTITION BY ServerID,WaitTypeID ORDER BY DateID,TimeID) AS delta_waiting_tasks_count,
wait_time_ms - LAG(wait_time_ms,1,0) OVER (PARTITION BY ServerID,WaitTypeID ORDER BY DateID,TimeID) AS delta_wait_time_ms,
max_wait_time_ms - LAG(max_wait_time_ms,1,0) OVER (PARTITION BY ServerID,WaitTypeID ORDER BY DateID,TimeID) AS delta_max_wait_time_ms,
signal_wait_time_ms - LAG(signal_wait_time_ms,1,0) OVER (PARTITION BY ServerID,WaitTypeID ORDER BY DateID,TimeID) AS delta_signal_wait_time_ms,
TimeID - LAG(TimeID,1,0) OVER (PARTITION BY ServerID,WaitTypeID ORDER BY DateID,TimeID) AS delta_sample_time_sec
FROM dbo.factWaitStats

In versions prior to SQL 2012 you could use CTE’s, views, subqueries, CROSS APPLYs to obtain the deltas.

Having obtained the deltas you can use the relationships between the dimension tables and the fact tables to graph the figures that you are interested in.

Powerpivot is a wonderful analysis tool and can be used to great effect here. You’d import the dimServer, dimDate, dimTime and dimWaitType dimension tables. You’d set the date format of the dimTime TimeValue column to HH:MM (so that you don’t see today’s date appearing along with the time). You’d import the factWaitStats data using the SQL 2012 LAG statement above as the source SQL query. If you’re not using SQL 2012 you’d use the following SQL query to import factWaitStats data…

-- If you swap rownum and partitionnum in the SELECT column list you incur a SORT operation
-- in the query plan
DENSE_RANK() OVER (ORDER BY ServerID,WaitTypeID) AS partitionnum
FROM dbo.factWaitStats;

…and then add the following calculated columns to derive the deltas:

=IF([rownum]=1,0,[waiting_tasks_count]-MAXX(FILTER(factWaitStats,[partitionnum] = EARLIER([partitionnum]) && [rownum]=EARLIER([rownum]) - 1),[waiting_tasks_count]))

=IF([rownum]=1,0,[wait_time_ms]-MAXX(FILTER(factWaitStats,[partitionnum] = EARLIER([partitionnum]) && [rownum]=EARLIER([rownum]) - 1),[wait_time_ms]))

=IF([rownum]=1,0,[max_wait_time_ms]-MAXX(FILTER(factWaitStats,[partitionnum] = EARLIER([partitionnum]) && [rownum]=EARLIER([rownum]) - 1),[max_wait_time_ms]))

=IF([rownum]=1,0,[signal_wait_time_ms]-MAXX(FILTER(factWaitStats,[partitionnum] = EARLIER([partitionnum]) && [rownum]=EARLIER([rownum]) - 1),[signal_wait_time_ms]))

=IF([rownum]=1,0,[TimeID]-MAXX(FILTER(factWaitStats,[partitionnum] = EARLIER([partitionnum]) && [rownum]=EARLIER([rownum]) - 1),[TimeID]))

I won’t go into the details of these DAX statements but you can see they are all of the same form. The only difference in them is one column name. The single partitionnum column saves us from having to type [ServerID]=EARLIER([ServerID]) && [WaitTypeID]=EARLIER([WaitTypeID]) in each definition. In other words, we import partitionnum using our source SQL statement and trade off some very minor storage cost for code simplification.

You then define the relationships between factWaitStats and dimServer, dimDate, dimTime and dimWaitType across ServerID, DateID, TimeID and WaitTypeID respectively. These are prerequisites for charting the data. The final Powerpivot worksheet should look something like this:

You can then use a pivot table and chart to graph the data. Here I’ve selected a day and am charting my WRITELOG waits:

In my case the statistics are read every 60 seconds so the deltas are relative to this sampling interval. You could use calculated measures in the pivot tables and charts to account for the sampling interval.

All sorts of simple dashboards could be created using the approach listed in this post.

Collecting wait type, latch and spinlock statistics and analysing them in this way gives you a true picture of the workloads your SQL Servers have to deal with and the resource limitations they face.

  1. Leave a comment

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: