Archive for category Monitoring

Meaningful results from cumulative statistics

From a monitoring and troubleshooting perspective we’re all probably familiar with the dynamic management objects referred to in these very simple select statements:

select @@servername as servername,getdate() as sampletime,* from sys.dm_os_wait_stats
select @@servername as servername,getdate() as sampletime,* from sys.dm_os_latch_stats
select @@servername as servername,getdate() as sampletime,* from sys.dm_os_spinlock_stats

select @@servername as servername,getdate() as sampletime,* from sys.dm_io_virtual_file_stats(null,null)

select @@servername as servername,getdate() as sampletime,* from sys.dm_db_index_operational_stats(null,null,null,null)
select @@servername as servername,getdate() as sampletime,* from sys.dm_db_index_usage_stats

These queries return cumulative statistics since SQL Server was last restarted, since statistics were last manually cleared or since index metadata type actions were last run. Plotting cumulative figures over sample time we’d see steadily increasing slopes. Sudden increases in slope would indicate a point of interest. For example, sudden increase in share level lock wait time indicating blocking. These types of points of interest would be more easily revealed if we were to plot the differences or deltas between consecutive sample times. In many blog posts about cumulative statistics (for example, https://sqlscope.wordpress.com/2012/04/09/analysing-wait-type-latch-and-spinlock-statistics/) we see the use of SQL constructs such as windowing functions (LAG), subqueries, CROSS APPLY to calculate these deltas. Although these approaches are valid relying on SQL constructs can be limiting. You’re using SQL Server to calculate the deltas for every column that you wish to graph.

If a difference or delta calculating function existed in your reporting tool then the same generic approach could be applied to all your cumulative statistics columns rather than you having to hand roll SQL to do the calculations. In effect you are offloading the business intelligence function to your reporting tool.

The reporting tool that I tend to use for ad hoc reporting and charting is Excel. In terms of this product’s functionality we are all probably familiar with the simple line charts and possibly the pivot tables and charts that show sum, min, max, average aggregations. One option that exists for the aggregations is to show the summarised values based on a calculation. One of the calculations that’s available is the difference from previous value:

diffprevvalue1

When the difference from the previous SampleTime base field is chosen we see this as the resulting chart:

diffprevvalue2

 

Moving WaitTypeName to the Legend Series you can now make use of the TOP N filter to show the highest waits (here I’ve not filtered out all the system level waits, in a real world analysis you’d filter these out in the SELECT statement):

 

diffprevvalue3

 

Leave a comment