Extended event for inaccurate cardinality estimate

An extended event for inaccurate cardinality estimate is available to those of us using SQL Server 2012. The description for the event makes clear that the event can have significant performance overhead.

It’s best to heed this warning. I’d recommend conducting any analysis on a non-production server.

One of the event data columns that appears for this event is named actual_rows. Testing this functionality for the first time I wasn’t able to make sense of the values returned. I populated my demo table with two rows, ran a stored procedure to select from the table (as normal, the plan is cached after first execution), inserted ninety eight more rows and ran the stored procedure again. The output of the extended event showed me an estimated_rows column with a value of two (which made sense) but the actual_rows wasn’t one hundred or ninety eight. Here’s the result set:

After some research it appears that the actual_rows value represents row count thresholds. This post give a full explanation:

http://blogs.msdn.com/b/ialonso/archive/2012/04/16/how-the-query-execution-engine-calculates-the-threshold-it-uses-to-detect-inaccurate-cardinality-estimation-and-fire-the-inaccurate-cardinality-estimate-extended-event.aspx

According to this post the extended event is fired by a subset of the plan operators (those using the GetRow method) when predefined row count thresholds have been exceeded. This makes sense from a performance impact point of view. The post mentions thresholds based on five times the base cardinality. To confirm this fact the following script was run and the result sets examined for varying numbers of inserts:


IF OBJECT_ID('dbo.demotable','U') IS NOT NULL DROP TABLE dbo.demotable;

CREATE TABLE dbo.demotable
(
id int NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED,
val char(3) NOT NULL
);

CREATE INDEX idxdemotableval ON dbo.demotable (val);

IF OBJECT_ID('dbo.fetchfromdemotable','P') IS NOT NULL DROP PROCEDURE dbo.fetchfromdemotable;
GO

CREATE PROCEDURE dbo.fetchfromdemotable @val char(3)
AS
BEGIN

DECLARE @id int

SET NOCOUNT ON

SELECT    @id = id
FROM dbo.demotable
WHERE val = @val

END;
GO

-- Insert two rows into the table
INSERT INTO dbo.demotable (val) VALUES ('abc')
GO 2

-- Run the stored procedure for the first time. It will fetch just the 2 rows
-- The query plan shows node_id 0 being the nonclustered index seek on the val column
EXEC dbo.fetchfromdemotable 'abc'
GO

-- Set up the extended event session to catch inaccurate cardinality estimate events
IF EXISTS (SELECT * FROM sys.server_event_sessions WHERE name = 'extevInaccurateCardinalityEstimate')
DROP EVENT SESSION extevInaccurateCardinalityEstimate
ON SERVER;
GO

-- Specify your database_id and session_id in the WHERE clause
CREATE EVENT SESSION extevInaccurateCardinalityEstimate
ON SERVER
ADD EVENT sqlserver.inaccurate_cardinality_estimate
(
ACTION (sqlserver.session_id,sqlserver.database_id,sqlserver.sql_text,sqlserver.plan_handle)
WHERE sqlserver.database_id = 8
AND sqlserver.session_id = 54
)
ADD TARGET package0.ring_buffer
(SET max_memory = 4096)
WITH (max_dispatch_latency = 1 seconds);
GO

ALTER EVENT SESSION extevInaccurateCardinalityEstimate
ON SERVER
STATE = START
GO

-- Insert more rows in addition to the 2 that we already have
INSERT INTO dbo.demotable (val) VALUES ('abc')
GO 7        -- one less than first threshold
--GO 8        -- at first threshold
--GO 57        -- one less than second threshold
--GO 58        -- at second threshold
--GO 357    -- one less than third threshold
--GO 358    -- at third threshold
--GO 2157    -- one less fourth than threshold
--GO 2158    -- at fourth threshold
GO

-- Run proc
EXEC dbo.fetchfromdemotable 'abc'
GO

-- estimated_rows will always be 2 as that's what we had when the procedure was first executed
SELECT
event_data.value('(@name)[1]','varchar(50)') AS event_name,
DATEADD(hh,DATEDIFF(hh,GETUTCDATE(),CURRENT_TIMESTAMP),event_data.value('(@timestamp)[1]','datetime2')) AS [timestamp],
event_data.value('(action[@name="sql_text"]/value)[1]','nvarchar(max)') AS sql_text,
event_data.value('(data[@name="node_id"]/value)[1]','int') AS node_id,
event_data.value('(data[@name="estimated_rows"]/value)[1]','int') AS estimated_rows,
event_data.value('(data[@name="actual_rows"]/value)[1]','int') AS actual_rows,
event_data.value('(data[@name="fire_count"]/value)[1]','int') AS fire_count,
event_data.value('xs:hexBinary((action[@name="plan_handle"]/value)[1])','varbinary(64)') AS plan_handle
FROM (
SELECT CAST(target_data AS XML) target_data
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
ON s.address = t.event_session_address
WHERE s.name = 'extevInaccurateCardinalityEstimate'
AND t.target_name = 'ring_buffer'
) AS Data
CROSS APPLY target_data.nodes ('RingBufferTarget/event') AS p(event_data)
GO

ALTER EVENT SESSION extevInaccurateCardinalityEstimate
ON SERVER
STATE = STOP
GO

-- SELECT * FROM sys.dm_exec_query_plan (0x05000800CCB4027E10B402710100000001000000000000000000000000000000000000000000000000000000)

You run this script in a test database and change the database_id and session_id to appropriate values. The script can be run as a whole. The first run specifies a GO 7. This is for a total row count of 9 for the demo table. You won’t see a result set for the final select indicating that the event has not been fired. The row count threshold has not been reached.

If you comment out the GO 7 and remove the comments for GO 8 and rerun the entire script you’ll see a result set:

Fives times the estimated_rows gives you the threshold seen the in the actual_rows value. We added eight to the already existing two rows so the event fired.

The next threshold is for a total row count of the ten existing rows plus first times the lastest threshold (ten) ie sixty. If you comment out the GO 8 and uncomment GO 57 this will be the case for one less than the next threshold. Run the script in its entirety and you’ll see no change to the result set. We’re under the second threshold so only one event fired. Comment out GO 57 and uncomment GO 58, run the script and you’ll see the second threshold has been met:

The third threshold is the 60 rows we have plus first times the current threshold of 60 ie 360.

The fourth threshold is 360 plus five times 360 ie 2160. Carry on changing the commented out lines and you’ll see the change in the number of events fired. The result set for the final run is as follows:

So the result set now makes sense. Events are fired when row count thresholds are exceeded. The combination of the plan handle and node_id allows you to identify the plan and operator (see the commented out SELECT FROM sys.dm_exec_query_plan at the bottom of the script).

Advertisements
  1. Leave a comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: