Archive for category Database consistency check

DBCC CheckDB internal snapshot size

There are various ways to see the on-disk size for manually created database snapshots . For example, sys.dm_io_virtual_file_stats returns a size_on_disk_bytes column. You can monitor this column to see the effect of ‘copy on write’ for the sparse files that make up the database snapshot. One type of snapshot that we potentially would like to see the details for would be the internal snapshots that are created when DBCC CheckDB or DBCC CheckTable are run. We know that the DBCC internal snapshot files are created in the root of each data file’s home volume. We know that we have to provision enough space for the size of these DBCC internal snapshot files. Unfortunately, sys.dm_io_virtual_file_stats does not show details for these files. We know that the size of these files will be dependent on the amount of write activity of the checked database. CheckDBs against large databases can run for quite some time. Concurrent user activity could give rise to large DBCC internal snapshots as a result of the ‘copy on write’ effect. In an ideal world we’d like to move the CheckDB operational maintenance window so that internal snapshots are as small as possible. Without knowing the exact size of the snapshots this becomes quite difficult.

One simple but indirect way of examining the size of the DBCC internal snapshots is to monitor the reduction in free space for each of the data file’s volumes (assuming the data files are not growing). The reduction would be equivalent to the size of the hidden snapshot files.

Since SQL Server itself exposes no figures for the DBCC internal snapshot files the functionality within Windows was examined to try to identify anything related to the size of compressed or sparse files. Doing searches a very good reply by Jeffrey Snover [MSFT] was found in this StackOverflow thread:

This Powershell snippet was include in the following script:

add-type -type  @'
using System;
using System.Runtime.InteropServices;
using System.ComponentModel;

namespace Win32Functions{

public class ExtendedFileInfo
[DllImport("kernel32.dll", SetLastError=true, EntryPoint="GetCompressedFileSize")]
static extern uint GetCompressedFileSizeAPI(string lpFileName, out uint lpFileSizeHigh);

public static ulong GetCompressedFileSize(string filename)
uint high;
uint low;

low = GetCompressedFileSizeAPI(filename, out high);

int error = Marshal.GetLastWin32Error();

if (high == 0 && low == 0xFFFFFFFF && error != 0)
throw new Win32Exception(error);
return ((ulong)high << 32) + low;

$datafiles = @("\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf","\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf","\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf")

while (1)
Start-Sleep -Seconds 1

$thistime = (Get-Date).DateTime

$streams = $datafiles | foreach-object { $_ + ":" + (get-item $_ -stream * | where-object {$_.Stream -ne ':$DATA'}).Stream }

$streams | where-object {$_ -match 'MSSQL_DBCC'} | foreach-object { $size = [Win32Functions.ExtendedFileInfo]::GetCompressedFileSize($_); "$thistime,$size,$_" }  | out-file -Append "c:\temp\dbccstreams.txt"

The $datafiles array holds the names of the database data files. Inside the while loop we enumerate any NTFS streams, excluding the main stream ($DATA), that exist for these data files. The script uses the get-item -stream parameter that is new in Powershell 3.0 (I recommend this be installed on your monitoring workstation or server and not your production SQL Server). The GetCompressedFileSize function is used to enumerate the size in bytes of each stream. The while loop samples the sizes every 1 second.

In a typical execution you would run the Powershell script on your monitoring server, you’d then invoke DBCC CheckDB on your SQL Server. When this completes you’d hit control-break on the Powershell script to exit and then finally examine the text results in the c:\temp\dbccstreams.txt file.

You can see from the hard-coded elements of the script that my database consisted of three data files on three separate volumes (S:, U: and V:) The data files were sized at 1024MB each. The database consisted of one very simple table with sample data as follows:

-- Create test table. Note, it's a bad idea to cluster on NEWID(). This is done purely for testing.
CREATE TABLE dbo.TestTable
val char(100) NOT NULL DEFAULT REPLICATE('abcd',25)

-- Populate with data

DECLARE @i int = 1

WHILE (1=1)

IF (@i % 100 = 0)
-- Exit if database is over 40% full
IF (SELECT SUM(FILEPROPERTY(name,'spaceused')/128.0)/(SUM(size)/128.0) FROM sys.database_files WHERE type_desc <> 'LOG') > 0.40 BREAK

SET @i = @i + 1


When DBCC CheckDB was run against this data with no concurrent user activity the output of the Powershell monitoring script was as follows:

01 January 2013 09:00:59,131072,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:00:59,65536,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:00:59,65536,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:01:00,131072,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:01:00,65536,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:01:00,65536,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:01:01,131072,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:01:01,65536,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:01:01,65536,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:01:02,131072,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:01:02,65536,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:01:02,65536,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:01:03,131072,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:01:03,65536,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:01:03,65536,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:01:04,131072,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:01:04,65536,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:01:04,65536,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:01:05,131072,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:01:05,65536,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:01:05,65536,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:01:06,131072,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:01:06,65536,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:01:06,65536,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:01:07,131072,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:01:07,65536,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:01:07,65536,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:01:08,131072,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:01:08,65536,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:01:08,65536,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:01:10,131072,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:01:10,65536,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:01:10,65536,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8

Charting this data in a pivot chart in Excel we see this:


You can see the name format of the DBCC internal snapshot files (streams, to be precise) and the fact that they are always a multiple of 64K in size.

Now for DBCC CheckDB with concurrent user activity (in my case, with just one connection running the population of dbo.TestTable with default values up from 40% to 60% database space used) the following results were obtained running the Powershell monitoring script:

01 January 2013 09:07:36,65536,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:36,65536,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:36,65536,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:07:37,3080192,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:37,2555904,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:37,2752512,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:07:38,3473408,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:38,3145728,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:38,3276800,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:07:39,3735552,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:39,3997696,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:39,3538944,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:07:40,4390912,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:40,4456448,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:40,4128768,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:07:41,4718592,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:41,5242880,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:41,4587520,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:07:42,5373952,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:42,5898240,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:42,5177344,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:07:43,5636096,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:43,6684672,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:43,5701632,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:07:44,5898240,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:44,7143424,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:44,6160384,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:07:45,6553600,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:45,7995392,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:45,6815744,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:07:47,6881280,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:47,8388608,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:47,7536640,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:07:48,7340032,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:48,9568256,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:48,8192000,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8
01 January 2013 09:07:49,8978432,\\SQLVSERVER1\s$\MSSQL\Data\MyCheckDBTest_data1.mdf:MSSQL_DBCC8
01 January 2013 09:07:49,11272192,\\SQLVSERVER1\u$\MSSQL\Data\MyCheckDBTest_data2.ndf:MSSQL_DBCC8
01 January 2013 09:07:49,10092544,\\SQLVSERVER1\v$\MSSQL\Data\MyCheckDBTest_data3.ndf:MSSQL_DBCC8

Pivot charting this in Excel shows:

The vertical axis is size in bytes so you can see that for my very small database we’re not seeing huge snapshot files. Again, the sizes are always multiples of 64K. We can see the increase in size on all three data files as a result of the ‘copy on write’ effect.

Running the Powershell monitoring script against larger databases with high volumes of concurrent update activity will generate snapshot file sizes in the hundreds of megabyte to gigabyte range. If your execution of DBCC CheckDB is scheduled then you may observe smaller snapshot file sizes by shifting the start time to a different value. When you originally set the schedule for the DBCC execution what you assumed was a quiet time for your database may well have been a busy time.








Leave a comment