Earlier posts described how you can collect and analyse waits to identify resource limitations and constraints for your SQL Servers. This post outlines a similar approach for identifying workload impact.
In many cases we use server-side traces to answer questions such as 'what is running against my SQL Server?'. Such questions normally arise as a result of performance issues affecting a front-end application. When used with event and column filters, SQL traces are very good at identifying specific troublesome queries. Extended events, as an alternative to traces, also meet this requirement. But to answer a very high-level question such as 'has my workload on this server increased or decreased?' using traces, you'd have to capture practically all queries and analyse the results using tools such as SQL Nexus and the RML utilities.
A much easier way to answer overall workload questions is to collect and analyse the query statistics that SQL Server automatically saves for you. They can be found in the sys.dm_exec_query_stats dynamic management view (DMV). By sampling this view at regular intervals you can plot trends for execution counts, worker times, durations, reads, writes, row counts and so on. The statistics are available at statement-level granularity. This means you have performance data for ad hoc statements, prepared statements and the individual statements within stored procedures, functions and triggers. By saving historic figures at this level of granularity you can answer questions not just at the overall workload level but also at the individual statement level. For example, 'has the performance of a specific statement in a stored procedure degraded over time?'
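To give a feel for what the DMV holds, here is a quick look at the raw statement-level counters. This is just an illustrative sketch, not part of the collection process described below; it simply lists the top statements by accumulated worker time:

```sql
-- Sketch: a single ad hoc look at the statement-level counters in the DMV.
-- The collection process described in this post aggregates these by
-- query_hash before storing them in the warehouse.
SELECT TOP (10)
    qs.sql_handle,
    qs.statement_start_offset,
    qs.statement_end_offset,
    qs.execution_count,
    qs.total_worker_time,
    qs.total_elapsed_time,
    qs.total_logical_reads,
    qs.last_execution_time
FROM sys.dm_exec_query_stats qs
ORDER BY qs.total_worker_time DESC;
```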
There are caveats to using sys.dm_exec_query_stats. Because the statistics are associated with cached plans, certain executions will not have their statistics stored. For example, last_execution_time, execution_count and so on won't be updated if you execute a stored procedure WITH RECOMPILE. Certain commands will not have query statistics entries at all: BEGIN TRAN, ROLLBACK TRAN, DBCC, BACKUP etc. The only way to capture these missing entries is to capture the entire workload using traces. The downside is the potentially tens of gigabytes worth of trace files that can be generated for busy systems. So, by using query statistics for the analysis we trade off some commands we may miss for the saving in disk space and the overhead of processing a potentially large number of trace files.
Gauging the impact of missing commands is fairly straightforward: compare a trace analysis with a query statistics analysis. If total reads, total duration and total CPU from the trace analysis match the corresponding totals from the query statistics analysis, then for the given workload you've validated the query statistics approach.
Having set the background, we'll move on to describe one approach to collecting and analysing data from sys.dm_exec_query_stats.
In SQL Server 2008 and above we have the query_hash column. By grouping on query_hash, statement_start_offset and statement_end_offset we can aggregate the values for execution count, worker time, logical reads and so on to measure the workload hitting your SQL Server. To minimise the impact of statistics collection, the approach listed here does not fetch the actual query text. The text can easily be obtained once the overall workload picture has been presented.
Just as for the collection of wait, latch and spinlock statistics (see the earlier post), the approach here is to collect the query statistics from target servers and store the data in a warehouse on another server. Compared to the earlier post, the only dimension table that changes is the dimServer table. We add a column to store the date and time that query statistics were last fetched. By making use of this column we avoid resampling data we have already collected. Here is the modified dimServer table:
-- Server table
IF OBJECT_ID('dbo.dimServer','U') IS NOT NULL
    DROP TABLE dbo.dimServer;
GO

CREATE TABLE dbo.dimServer (
    ServerID int NOT NULL IDENTITY(1,1),
    ServerName sysname NOT NULL,
    QueryStatsLastFetch datetime NOT NULL
        CONSTRAINT df_dimServer_QueryStatsLastFetch DEFAULT ('19900101'),
    CONSTRAINT pk_dimServer PRIMARY KEY CLUSTERED (ServerID),
    CONSTRAINT uq_dimServer UNIQUE NONCLUSTERED (ServerName)
);

-- Pre-populate with your list of servers
INSERT INTO dbo.dimServer (ServerName)
SELECT CAST(SERVERPROPERTY('ServerName') AS sysname);
The procedure to get and set the QueryStatsLastFetch column is here:
IF OBJECT_ID('dbo.procServerQueryStatsLastFetch','P') IS NOT NULL
    DROP PROCEDURE dbo.procServerQueryStatsLastFetch;
GO

CREATE PROCEDURE dbo.procServerQueryStatsLastFetch
    @GetSet char(3),
    @ServerName sysname,
    @QueryStatsLastFetch datetime OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    IF UPPER(@GetSet) = 'SET'
    BEGIN
        UPDATE dbo.dimServer
        SET QueryStatsLastFetch = @QueryStatsLastFetch
        WHERE ServerName = @ServerName;
    END

    IF UPPER(@GetSet) = 'GET'
    BEGIN
        SELECT @QueryStatsLastFetch = QueryStatsLastFetch
        FROM dbo.dimServer
        WHERE ServerName = @ServerName;
    END

    RETURN;
END;
The query_hash, statement_start_offset and statement_end_offset values are stored in another dimension table and are referenced through the QueryID column:
IF OBJECT_ID('dbo.dimQuery','U') IS NOT NULL
    DROP TABLE dbo.dimQuery;
GO

CREATE TABLE dbo.dimQuery (
    QueryID int IDENTITY(1,1) NOT NULL,
    ServerID int NOT NULL,
    query_hash binary(8) NOT NULL,
    statement_start_offset int NOT NULL,
    statement_end_offset int NOT NULL,
    CONSTRAINT pk_dimQuery PRIMARY KEY CLUSTERED (QueryID),
    CONSTRAINT uq_dimQuery UNIQUE NONCLUSTERED (ServerID,query_hash,statement_start_offset,statement_end_offset)
);
The query statistics are stored in this fact table:
IF OBJECT_ID('dbo.factQueryStats','U') IS NOT NULL
    DROP TABLE dbo.factQueryStats;
GO

CREATE TABLE dbo.factQueryStats (
    ServerID int NOT NULL,
    DateID int NOT NULL,
    TimeID int NOT NULL,
    QueryID int NOT NULL,
    query_hash_count int NULL,
    plan_generation_num bigint NULL,
    creation_time datetime NULL,
    last_execution_time datetime NULL,
    execution_count bigint NULL,
    total_worker_time bigint NULL, last_worker_time bigint NULL, min_worker_time bigint NULL, max_worker_time bigint NULL,
    total_physical_reads bigint NULL, last_physical_reads bigint NULL, min_physical_reads bigint NULL, max_physical_reads bigint NULL,
    total_logical_writes bigint NULL, last_logical_writes bigint NULL, min_logical_writes bigint NULL, max_logical_writes bigint NULL,
    total_logical_reads bigint NULL, last_logical_reads bigint NULL, min_logical_reads bigint NULL, max_logical_reads bigint NULL,
    total_clr_time bigint NULL, last_clr_time bigint NULL, min_clr_time bigint NULL, max_clr_time bigint NULL,
    total_elapsed_time bigint NULL, last_elapsed_time bigint NULL, min_elapsed_time bigint NULL, max_elapsed_time bigint NULL,
    total_rows bigint NULL, last_rows bigint NULL, min_rows bigint NULL, max_rows bigint NULL,
    CONSTRAINT pk_factQueryStats PRIMARY KEY CLUSTERED (ServerID,QueryID,DateID,TimeID)
);
To complete the setup we have the table type and stored procedure that inserts into the fact table:
-- Drop the stored procedure that uses the table valued parameter
IF OBJECT_ID('dbo.procInsertQueryStats','P') IS NOT NULL
    DROP PROCEDURE dbo.procInsertQueryStats;
GO

-- Table type for table valued parameter
IF EXISTS (SELECT * FROM sys.types WHERE name = 'typeQueryStats')
    DROP TYPE dbo.typeQueryStats;
GO

CREATE TYPE dbo.typeQueryStats AS TABLE (
    ServerName sysname NOT NULL,
    SampleDateTime datetime NOT NULL,
    query_hash binary(8) NOT NULL,
    statement_start_offset int NOT NULL,
    statement_end_offset int NOT NULL,
    query_hash_count int NULL,
    plan_generation_num bigint NULL,
    creation_time datetime NULL,
    last_execution_time datetime NULL,
    execution_count bigint NULL,
    total_worker_time bigint NULL, last_worker_time bigint NULL, min_worker_time bigint NULL, max_worker_time bigint NULL,
    total_physical_reads bigint NULL, last_physical_reads bigint NULL, min_physical_reads bigint NULL, max_physical_reads bigint NULL,
    total_logical_writes bigint NULL, last_logical_writes bigint NULL, min_logical_writes bigint NULL, max_logical_writes bigint NULL,
    total_logical_reads bigint NULL, last_logical_reads bigint NULL, min_logical_reads bigint NULL, max_logical_reads bigint NULL,
    total_clr_time bigint NULL, last_clr_time bigint NULL, min_clr_time bigint NULL, max_clr_time bigint NULL,
    total_elapsed_time bigint NULL, last_elapsed_time bigint NULL, min_elapsed_time bigint NULL, max_elapsed_time bigint NULL,
    total_rows bigint NULL, last_rows bigint NULL, min_rows bigint NULL, max_rows bigint NULL
);
GO

CREATE PROCEDURE dbo.procInsertQueryStats
    @NearestSeconds int,
    @tblStats typeQueryStats READONLY
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @ServerID int;
    DECLARE @ServerName sysname;
    DECLARE @SampleDateTime datetime;
    DECLARE @QueryStatsLastFetch datetime;
    DECLARE @DateID int;
    DECLARE @TimeID int;

    -- Summary dimension data from stats table
    SELECT TOP 1
        @ServerID = ds.ServerID,
        @ServerName = ds.ServerName,
        @SampleDateTime = s.SampleDateTime,
        @QueryStatsLastFetch = s.last_execution_time
    FROM @tblStats s
    INNER JOIN dbo.dimServer ds ON ds.ServerName = s.ServerName
    ORDER BY s.last_execution_time DESC;

    -- Normalise the sample time - round down to the nearest @NearestSeconds seconds
    SET @SampleDateTime = DATEADD(second, @NearestSeconds*(DATEPART(second,@SampleDateTime)/@NearestSeconds), CONVERT(varchar(40),@SampleDateTime,100));

    -- Date and time IDs
    SELECT @DateID = DateID FROM dbo.dimDate WHERE DateValue = CAST(@SampleDateTime AS date);
    SELECT @TimeID = TimeID FROM dbo.dimTime WHERE TimeValue = CAST(@SampleDateTime AS time);

    -- Insert queries that are new since the last collection
    INSERT INTO dbo.dimQuery (ServerID,query_hash,statement_start_offset,statement_end_offset)
    SELECT @ServerID, query_hash, statement_start_offset, statement_end_offset
    FROM @tblStats s
    WHERE NOT EXISTS (
        SELECT *
        FROM dbo.dimQuery dq
        WHERE dq.ServerID = @ServerID
        AND dq.query_hash = s.query_hash
        AND dq.statement_start_offset = s.statement_start_offset
        AND dq.statement_end_offset = s.statement_end_offset
    );

    INSERT INTO dbo.factQueryStats
    SELECT @ServerID, @DateID, @TimeID, dq.QueryID,
        s.query_hash_count, s.plan_generation_num, s.creation_time, s.last_execution_time, s.execution_count,
        s.total_worker_time, s.last_worker_time, s.min_worker_time, s.max_worker_time,
        s.total_physical_reads, s.last_physical_reads, s.min_physical_reads, s.max_physical_reads,
        s.total_logical_writes, s.last_logical_writes, s.min_logical_writes, s.max_logical_writes,
        s.total_logical_reads, s.last_logical_reads, s.min_logical_reads, s.max_logical_reads,
        s.total_clr_time, s.last_clr_time, s.min_clr_time, s.max_clr_time,
        s.total_elapsed_time, s.last_elapsed_time, s.min_elapsed_time, s.max_elapsed_time,
        s.total_rows, s.last_rows, s.min_rows, s.max_rows
    FROM @tblStats s
    INNER JOIN dbo.dimQuery dq
        ON dq.ServerID = @ServerID
        AND dq.query_hash = s.query_hash
        AND dq.statement_start_offset = s.statement_start_offset
        AND dq.statement_end_offset = s.statement_end_offset;

    -- Update the last collection time
    EXEC dbo.procServerQueryStatsLastFetch 'SET', @ServerName, @QueryStatsLastFetch OUTPUT;
END;
GO
As with the wait, latch and spinlock statistics, there are many ways by which we could populate the fact table. A PowerShell job step running the following script is one possibility:
# Query stats
# Get last collection datetime from data warehouse server
$targetserver = "DWHSRV"
$dbname = "DBAdwh"
$procname = "dbo.procServerQueryStatsLastFetch"
$getset = "GET"
$servername = "TARGETSRV"
[System.DateTime]$querystatslastfetch = "01 January 1990"

$conn = New-Object System.Data.SqlClient.SqlConnection("Data Source=$targetserver;Initial Catalog=$dbname;Integrated Security=SSPI;")
$conn.Open()
$cmd = New-Object System.Data.SqlClient.SqlCommand($procname,$conn)
$cmd.CommandType = [System.Data.CommandType]'StoredProcedure'
$cmd.Parameters.Add("@GetSet",$getset) | Out-Null
$cmd.Parameters.Add("@ServerName",$servername) | Out-Null
$cmd.Parameters.Add("@QueryStatsLastFetch",$querystatslastfetch) | Out-Null
$cmd.Parameters["@QueryStatsLastFetch"].Direction = [System.Data.ParameterDirection]::Output
$cmd.ExecuteNonQuery() | Out-Null
[System.DateTime]$querystatslastfetch = $cmd.Parameters["@QueryStatsLastFetch"].Value
$conn.Close()
$conn.Dispose()

# Fetch stats from target server
$targetserver = "TARGETSRV"
# Format the watermark explicitly so the embedded datetime literal is not locale dependent
$fetchstr = $querystatslastfetch.ToString("yyyy-MM-ddTHH:mm:ss.fff")
$sql = @"
EXEC sp_executesql N'SELECT
CAST(SERVERPROPERTY(''ServerName'') AS sysname) AS ServerName,
GETDATE() AS SampleDateTime,
query_hash, statement_start_offset, statement_end_offset,
COUNT(*) AS query_hash_count,
MAX(plan_generation_num) AS plan_generation_num,
MAX(creation_time) AS creation_time,
MAX(last_execution_time) AS last_execution_time,
SUM(execution_count) AS execution_count,
SUM(total_worker_time) AS total_worker_time, MAX(last_worker_time) AS last_worker_time, MIN(min_worker_time) AS min_worker_time, MAX(max_worker_time) AS max_worker_time,
SUM(total_physical_reads) AS total_physical_reads, MAX(last_physical_reads) AS last_physical_reads, MIN(min_physical_reads) AS min_physical_reads, MAX(max_physical_reads) AS max_physical_reads,
SUM(total_logical_writes) AS total_logical_writes, MAX(last_logical_writes) AS last_logical_writes, MIN(min_logical_writes) AS min_logical_writes, MAX(max_logical_writes) AS max_logical_writes,
SUM(total_logical_reads) AS total_logical_reads, MAX(last_logical_reads) AS last_logical_reads, MIN(min_logical_reads) AS min_logical_reads, MAX(max_logical_reads) AS max_logical_reads,
SUM(total_clr_time) AS total_clr_time, MAX(last_clr_time) AS last_clr_time, MIN(min_clr_time) AS min_clr_time, MAX(max_clr_time) AS max_clr_time,
SUM(total_elapsed_time) AS total_elapsed_time, MAX(last_elapsed_time) AS last_elapsed_time, MIN(min_elapsed_time) AS min_elapsed_time, MAX(max_elapsed_time) AS max_elapsed_time,
0, 0, 0, 0
FROM sys.dm_exec_query_stats
WHERE last_execution_time > @querystatslastfetch
GROUP BY query_hash, statement_start_offset, statement_end_offset',
N'@querystatslastfetch datetime',
'$fetchstr'
"@

$conn = New-Object System.Data.SqlClient.SqlConnection("Data Source=$targetserver;Initial Catalog=master;Integrated Security=SSPI;")
$adapter = New-Object System.Data.SqlClient.SqlDataAdapter($sql,$conn)
$dataset = New-Object System.Data.DataSet
$adapter.Fill($dataset) | Out-Null
$conn.Dispose()

# Save results to data warehouse
$targetserver = "DWHSRV"
$dbname = "DBAdwh"
$procname = "dbo.procInsertQueryStats"
$nearestseconds = 60

$conn = New-Object System.Data.SqlClient.SqlConnection("Data Source=$targetserver;Initial Catalog=$dbname;Integrated Security=SSPI;")
$conn.Open()
$cmd = New-Object System.Data.SqlClient.SqlCommand($procname,$conn)
$cmd.CommandType = [System.Data.CommandType]'StoredProcedure'
$cmd.Parameters.Add("@NearestSeconds",$nearestseconds) | Out-Null
# Pass the first DataTable (not the Tables collection) as the table valued parameter
$cmd.Parameters.Add("@tblStats",$dataset.Tables[0]) | Out-Null
$cmd.ExecuteNonQuery() | Out-Null
$conn.Close()
$conn.Dispose()
Notice that in the sys.dm_exec_query_stats query I'm setting the total_rows, last_rows, min_rows and max_rows columns to zero. These columns are only available in SQL Server 2008 R2 and higher. The code can be changed to dynamically include or exclude these columns based on the version of the target server being examined.
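One way to do this (a sketch, not the exact code used here) is to branch on the version string reported by the target server before building the collection statement:

```sql
-- Sketch: detect whether the target supports the *_rows columns.
-- SQL Server 2008 reports ProductVersion 10.0.x; 2008 R2 reports 10.50.x.
DECLARE @ver varchar(20);
SET @ver = CAST(SERVERPROPERTY('ProductVersion') AS varchar(20));

IF @ver NOT LIKE '9.%' AND @ver NOT LIKE '10.0.%'
    SELECT 'Target is 2008 R2 or higher: select SUM(total_rows), MAX(last_rows), MIN(min_rows), MAX(max_rows)';
ELSE
    SELECT 'Older target: keep the 0, 0, 0, 0 placeholders as above';
```

In the PowerShell script this check would drive which of two here-strings is assigned to $sql before the data adapter is filled.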
For a busy system a potentially large number of rows will be returned for each collection, so it is important to put in place a purge routine (a scheduled job) to remove historic data. The retention period can be tailored for each target server by mapping this attribute to a new column in the dimServer table. The following script takes the simpler 'one retention period for all servers' approach:
SET NOCOUNT ON;

DECLARE @RetainDays int;
SET @RetainDays = 7;

DECLARE @MinDateID int;
SET @MinDateID = CONVERT(varchar(8),DATEADD(day,-1*@RetainDays,GETDATE()),112);

SELECT 'Remove statistics less than or equal to ',@MinDateID;

SELECT 'Deleting historic QueryStats...';
WHILE @@ROWCOUNT <> 0
BEGIN
    DELETE TOP (1000)
    FROM dbo.factQueryStats
    WHERE DateID <= @MinDateID;
END

SELECT 'Deleting Queries which have no QueryStats...';
WHILE @@ROWCOUNT <> 0
BEGIN
    DELETE TOP (1000) q
    FROM dbo.dimQuery q
    WHERE NOT EXISTS (
        SELECT *
        FROM dbo.factQueryStats qs
        WHERE qs.QueryID = q.QueryID
    );
END
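If you do want per-server retention, the delete can instead join to a hypothetical RetainDays column added to dimServer (this column is an assumption, not part of the schema above), batched in the same WHILE loop style:

```sql
-- Sketch: per-server retention driven by an assumed dbo.dimServer.RetainDays
-- column. Run inside a WHILE @@ROWCOUNT <> 0 loop as in the script above.
DELETE TOP (1000) f
FROM dbo.factQueryStats f
INNER JOIN dbo.dimServer s ON s.ServerID = f.ServerID
WHERE f.DateID <= CONVERT(varchar(8), DATEADD(day, -1*s.RetainDays, GETDATE()), 112);
```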
In terms of presenting the collected data, PowerPivot is once again the best tool. All tables with the exception of dimQuery are imported into PowerPivot as tables. The binary query_hash column in dimQuery generates a data type error if the import table option is chosen. To overcome this, create the following view in the data warehouse and import the data through it:
CREATE VIEW dbo.vwdimQuery
AS
SELECT QueryID,
    ServerID,
    master.sys.fn_varbintohexstr(query_hash) AS query_hash,
    statement_start_offset,
    statement_end_offset
FROM dbo.dimQuery;
After creating the appropriate relationships between the fact and dimension ID columns, the final data model appears as follows:
Below I chart the historic execution count. We can see it increasing for QueryID 14. Note that I stopped my SQL Server at about 13:25 and restarted it at about 19:30; the DMV statistics are reset when SQL Server is restarted:
As mentioned earlier, to minimise the impact on the target server the collection process does not fetch the actual query text. Once you've identified a query of interest you use its QueryID to fetch the query_hash from dimQuery. You then examine the target server's DMVs to get the full query details (the text, the plan, the plan attributes and so on).
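For example, given a query_hash taken from dimQuery, something along these lines run on the target server returns the statement text and plan. This is a sketch: the 0x... hash literal is a placeholder to be replaced with the real value:

```sql
-- Sketch: look up the statement text and plan for a given query_hash on
-- the target server. Replace the hash literal with the value from dimQuery.
SELECT
    SUBSTRING(st.text,
              (qs.statement_start_offset/2) + 1,
              ((CASE qs.statement_end_offset
                    WHEN -1 THEN DATALENGTH(st.text)
                    ELSE qs.statement_end_offset
                END - qs.statement_start_offset)/2) + 1) AS statement_text,
    qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE qs.query_hash = 0x0000000000000000;  -- placeholder hash
```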
If you’re only interested in statistics at the procedure level and not at the individual statement level then a similar collection and analysis based on sys.dm_exec_procedure_stats can be performed.
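The procedure-level collection query is simpler because that DMV is already at object granularity; a sketch of a single sample:

```sql
-- Sketch: procedure-level counters from sys.dm_exec_procedure_stats
-- (available in SQL Server 2008 and higher).
SELECT
    DB_NAME(ps.database_id) AS database_name,
    OBJECT_NAME(ps.object_id, ps.database_id) AS procedure_name,
    ps.execution_count,
    ps.total_worker_time,
    ps.total_elapsed_time,
    ps.total_logical_reads,
    ps.last_execution_time
FROM sys.dm_exec_procedure_stats ps
ORDER BY ps.total_worker_time DESC;
```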
This simple approach to collecting and analysing query statistics will reveal interesting details about your SQL Server workloads. Regular patterns will emerge. Together with the wait, latch and spinlock analysis you should now have a complete picture of your SQL Servers.