Archive for category Transaction log

Delayed durability data loss demonstration

This MSDN article, http://msdn.microsoft.com/en-gb/library/dn449490.aspx, makes it clear under the section ‘When can I lose data? SQL Server shutdown and restart’ that a normal shutdown and restart can result in missing data. That is, client applications may receive acknowledgement that a transaction has been committed, but because of the deferred log writer activity the log records may not have been persisted to the transaction log file. The following scripts demonstrate the data loss that you could encounter.

First I’ll show that a database with delayed durability disabled does not suffer data loss. My testing database is named DelayedDurabilityTest. I will be inserting data into a table named MyData:

CREATE TABLE DelayedDurabilityTest.dbo.MyData
(
id BIGINT NOT NULL IDENTITY,
val VARCHAR(255) NOT NULL
)

For the first test I set delayed durability to disabled:

ALTER DATABASE [DelayedDurabilityTest] SET DELAYED_DURABILITY = DISABLED WITH NO_WAIT

Now in one query window run this to start populating the table:

-- Insert script - best to run this using SQLCMD - see the very end of this post
SET NOCOUNT ON
WHILE 1=1 
BEGIN
INSERT INTO DelayedDurabilityTest.dbo.MyData (val) VALUES ('Hello !!')
PRINT SCOPE_IDENTITY()
END

While this is running start a new query window and run this:

SHUTDOWN WITH NOWAIT

Switch back to the window that was running the inserts and note down the last printed identity value. For my example, this was:

...16248
16249
16250
16251
16252
Msg 0, Level 11, State 0, Line 0
A severe error occurred on the current command.  The results, if any, should be discarded.
Msg 0, Level 20, State 0, Line 0
A severe error occurred on the current command.  The results, if any, should be discarded.

So in my case the last printed identity value was 16252.

Now restart SQL Server and run:

SELECT COUNT(*) FROM DelayedDurabilityTest.dbo.MyData

In my case this returned a value of 16253. The fact that it is one higher than the last printed value in the window that was doing the inserts simply means the printing of output was lagging behind the actual inserting of data. The important point is that there is no data loss.

Now we truncate the table, set delayed durability to FORCED and rerun the inserting window and then the shutdown window:


TRUNCATE TABLE DelayedDurabilityTest.dbo.MyData

ALTER DATABASE [DelayedDurabilityTest] SET DELAYED_DURABILITY = FORCED WITH NO_WAIT

This time the output of the inserting window was this:


18554
18555
18556
18557
18558
18559
Msg 109, Level 20, State 0, Line 0
A transport-level error has occurred when receiving results from the server. (provider: Shared Memory Provider, error: 0 - The pipe has been ended.)

So the last printed identity value was 18559. As before, there may well be more rows in the table than this, because the slow printed output may be understating the actual number of rows inserted. However, compare this to the row count in the table after SQL Server was restarted: it was 18512, which is actually lower than the last acknowledged insert. This shows that with delayed durability enabled a SQL Server shutdown has resulted in data loss.
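
To make the comparison explicit after the restart you can also check the highest identity value that was made durable, a trivial variation on the earlier count query (the column aliases are mine):

-- Compare what survived the restart with the last acknowledged insert
SELECT COUNT(*) AS durable_rows, MAX(id) AS last_durable_id
FROM DelayedDurabilityTest.dbo.MyData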

When running the scripts it may take several attempts to produce a case where the row count is lower than the last printed identity value. Running the insert script via SQLCMD you'll get more occurrences where the last printed identity is higher than the actual row count in the table when SQL Server is restarted:


SQLCMD -S localhost -i insertscript.sql



Excessive logging pre-SQL 2012 when changing variable length columns from NULL to NOT NULL

In pre-SQL Server 2012 versions you may encounter excessive transaction logging when you use the ALTER TABLE ALTER COLUMN command to change a column from NULL to NOT NULL. Depending on the record size, the amount of transaction log space used could be over three times the size of the table you are altering. This post shows that this is specifically an issue when changing variable-length columns from NULL to NOT NULL. The issue does not occur for the same change applied to fixed-length columns.

The basis of this post is this StackExchange thread: http://dba.stackexchange.com/questions/29522/why-does-alter-column-to-not-null-cause-massive-log-file-growth

The thread pretty much gives the explanation of the problem, but the point about fixed versus variable length is not entirely clear. Here’s a test script which demonstrates the differences:


IF OBJECT_ID('dbo.testalternotnull') IS NOT NULL DROP TABLE dbo.testalternotnull;

CREATE TABLE dbo.testalternotnull
(
id int NOT NULL,
col varchar(3) NULL
);
GO

INSERT INTO dbo.testalternotnull VALUES (1,'abc')
GO 10

CHECKPOINT;

ALTER TABLE dbo.testalternotnull ALTER COLUMN col varchar(3) NOT NULL;

SELECT * FROM fn_dblog(null,null);

You can see the ten LOP_MODIFY_ROW entries where the AllocUnitName is dbo.testalternotnull:

[Screenshot: altertonotnull1]

Also note the log reserve figures.
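
If you want totals rather than eyeballing the individual rows, something along these lines sums the logging attributable to the table (a sketch; it simply aggregates the standard fn_dblog columns):

-- Rough totals of log space generated against the test table, per operation
SELECT [Operation],
COUNT(*) AS log_records,
SUM([Log Record Length]) AS total_log_bytes,
SUM([Log Reserve]) AS total_log_reserve
FROM sys.fn_dblog(NULL, NULL)
WHERE [AllocUnitName] = 'dbo.testalternotnull'
GROUP BY [Operation]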

Now let’s check the case where we are changing from NOT NULL to NULL:


IF OBJECT_ID('dbo.testalternotnull') IS NOT NULL DROP TABLE dbo.testalternotnull;

CREATE TABLE dbo.testalternotnull
(
id int NOT NULL,
col varchar(3) NOT NULL
);
GO

INSERT INTO dbo.testalternotnull VALUES (1,'abc')
GO 10

CHECKPOINT;

ALTER TABLE dbo.testalternotnull ALTER COLUMN col varchar(3) NULL;

SELECT * FROM fn_dblog(null,null);

[Screenshot: altertonotnull2]

No reference to the table name here, so this is just a metadata-only change.

Now let’s check the results for a fixed length column changing NULL to NOT NULL:


IF OBJECT_ID('dbo.testalternotnull') IS NOT NULL DROP TABLE dbo.testalternotnull;

CREATE TABLE dbo.testalternotnull
(
id int NOT NULL,
col char(3) NULL
);
GO

INSERT INTO dbo.testalternotnull VALUES (1,'abc')
GO 10

CHECKPOINT;

ALTER TABLE dbo.testalternotnull ALTER COLUMN col char(3) NOT NULL;

SELECT * FROM fn_dblog(null,null);

[Screenshot: altertonotnull3]

No references to LOP_MODIFY_ROW where AllocUnitName is dbo.testalternotnull, so for fixed-length columns this is a metadata-only change.

I won’t show the output for changing a CHAR(3) NOT NULL to NULL as the result is the same as that for VARCHAR(3) NOT NULL to NULL. That is, it is a metadata-only change.

What about BIT columns? Given that they are represented in the record layout as a byte-packed structure, they can be considered a variable-length type. Let’s check this by showing the output for changing a BIT NULL to a BIT NOT NULL:


IF OBJECT_ID('dbo.testalternotnull') IS NOT NULL DROP TABLE dbo.testalternotnull;

CREATE TABLE dbo.testalternotnull
(
id int NOT NULL,
col BIT NULL
);
GO

INSERT INTO dbo.testalternotnull VALUES (1,0)
GO 10

CHECKPOINT;

ALTER TABLE dbo.testalternotnull ALTER COLUMN col BIT NOT NULL;

SELECT * FROM fn_dblog(null,null);

[Screenshot: altertonotnull4]

So, yes, you can consider the BIT type to be a variable-length type in this context, and changing it from NULL to NOT NULL can result in high transaction log use.

In summary, this post shows examples of high, and what you may consider excessive, transaction log use when changing variable-length columns from NULL to NOT NULL. The StackExchange thread referred to at the start of this post shows that the resulting size of the transaction log may be many times the size of the table being altered. Fortunately, for SQL Server 2012 and above most column changes appear to be metadata-only changes.
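
If you want to quantify the log growth for one of these changes on your own pre-2012 system, a simple before-and-after check with DBCC SQLPERF(LOGSPACE) around the ALTER is enough (a sketch; run it in a test database where you can checkpoint freely):

-- Measure log growth caused by the ALTER (test database only)
CHECKPOINT                  -- allow reusable log to clear in SIMPLE recovery
DBCC SQLPERF(LOGSPACE)      -- note log size and percent used before the change

ALTER TABLE dbo.testalternotnull ALTER COLUMN col varchar(3) NOT NULL

DBCC SQLPERF(LOGSPACE)      -- compare percent used after the change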


Delayed durability: performance boost for certain workloads

If some of your workloads are able to tolerate a certain amount of data loss and they are already bottlenecked on the write performance of the transaction log, then SQL Server 2014’s delayed durability option will probably give you an immediate performance boost. One use case that comes to mind is staging databases where ETL work is being performed. Some of the data feeds may be implemented as single row inserts as opposed to batch inserts. Within the application environment you may have no options available to convert these to batch inserts (i.e. multiple inserts in one transaction). This post will show the performance improvements you could see by enabling delayed durability.

Single row inserts going into this table:


CREATE TABLE dbo.TestingData
(
id INT NOT NULL IDENTITY,
col1 INT NOT NULL DEFAULT 2000,
col2 VARCHAR(50) NOT NULL DEFAULT 'wxyz'
)

Test script inserting 10000 rows:


SET NOCOUNT ON
DECLARE @i INT = 1
WHILE @i <= 10000
BEGIN
INSERT INTO dbo.TestingData( col1, col2 ) VALUES (DEFAULT, DEFAULT)
SET @i = @i + 1
END
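
For comparison, converting to batch inserts, the option mentioned above that may not be available to you, would mean grouping the same inserts inside explicit transactions so the log is flushed once per commit rather than once per row. A sketch of that conversion (not used in the tests below; the batch size of 100 is arbitrary):

-- Hypothetical batched version: one commit (and one log flush) per 100 rows
SET NOCOUNT ON
DECLARE @i INT = 1
WHILE @i <= 10000
BEGIN
IF @@TRANCOUNT = 0 BEGIN TRANSACTION
INSERT INTO dbo.TestingData( col1, col2 ) VALUES (DEFAULT, DEFAULT)
IF @i % 100 = 0 COMMIT TRANSACTION
SET @i = @i + 1
END
IF @@TRANCOUNT > 0 COMMIT TRANSACTION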

Toggle the delayed durability option using this:


ALTER DATABASE [DelayedDurabilityTest] SET DELAYED_DURABILITY = <FORCED or DISABLED> WITH NO_WAIT

Results inserting 10000 rows (captured using before and after sampling of sys.dm_io_virtual_file_stats):

             sampletime    file_id    reads    readbytes    bytesperread    writes    writebytes    bytesperwrite
2014-04-21 10:22:02.967    1          0        0            0               0         0             0
2014-04-21 10:22:02.967    2          0        0            0               10000     5154816       515
2014-04-21 11:00:01.947    1          0        0            0               0         0             0
2014-04-21 11:00:01.947    2          0        0            0               1733      4046336       2334

The test at 10:22 was conducted with delayed durability disabled (the default). File id 1 is the data file, file id 2 is the log file. The test at 11:00 was conducted with delayed durability set to forced. You can see the reduced number of writes to the transaction log and the higher figure for bytes per write. The Sysinternals Process Monitor utility confirms these details. For the 10:22 test we see this:

[Screenshot: delayeddurability1]

The Sequence column starts at 0, so this is showing us 10000 writes, each of 512 bytes (matching the results from sys.dm_io_virtual_file_stats). For the 11:00 test we see this:

[Screenshot: delayeddurability2]

The Sequence column starts at 10000 and ends at 11732, so that’s 1733 write IOs, matching the figure from the DMV. The Detail column shows the larger write sizes.
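
For reference, the before-and-after sampling of the file stats amounts to capturing a query like this around each test run and diffing the write columns (a sketch; the column aliases are mine and match the table above):

-- Snapshot the cumulative file I/O stats for the test database
SELECT GETDATE() AS sampletime,
vfs.file_id,
vfs.num_of_reads AS reads,
vfs.num_of_bytes_read AS readbytes,
vfs.num_of_writes AS writes,
vfs.num_of_bytes_written AS writebytes
FROM sys.dm_io_virtual_file_stats(DB_ID('DelayedDurabilityTest'), NULL) AS vfs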

Following this simple test I changed 10000 to 500000 in the while loop that does the insert and captured the performance monitor counters for transactions/sec, log flushes and %processor time. Here are the results:

[Screenshot: delayeddurability3]

The execution took over five minutes with the option disabled and forty seconds when it was set to forced. You can see in black the improvement in transactions/sec.

The actual transaction log content is identical for both executions (not surprising as the option simply controls when the log buffer is written to disk, not its content). Here are the results for two transaction entries using fn_dblog:

[Screenshot: delayeddurability4]
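
A query along these lines will pull the per-transaction log records for that comparison (a sketch; it assumes the insert transactions are the only activity touching dbo.TestingData in this database):

-- Log records for the transactions that touched the test table
SELECT [Current LSN], [Transaction ID], [Operation], [Context], [AllocUnitName], [Log Record Length]
FROM sys.fn_dblog(NULL, NULL)
WHERE [Transaction ID] IN
(
SELECT [Transaction ID]
FROM sys.fn_dblog(NULL, NULL)
WHERE [AllocUnitName] = 'dbo.TestingData'
)
ORDER BY [Current LSN]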

Delayed durability is available in all editions of SQL Server 2014. For workloads that can tolerate a certain amount of data loss you’ll probably notice an improvement in your transaction processing rate.


Mining transaction log backups

It’s commonly known that the sys.fn_dblog function can be used to query the contents of the transaction log. Frequently run data changes can be identified using this function. If you are taking transaction log backups then the record of the committed changes will be written to the transaction log backup and the inactive portion of the log will be cleared for reuse. If you are running scheduled transaction log backups it’s possible to mine these backups using the sys.fn_dump_dblog function. An example of a statistic that may be useful is the number of page splits occurring for a given clustered index.

The following script creates a test table, inserts some test data in clustered primary key order and finally inserts another row into the middle of the existing range:


CREATE TABLE dbo.testsplits (
id int NOT NULL CONSTRAINT pk_testsplits PRIMARY KEY,
val char(500) NOT NULL
)

BACKUP DATABASE TestSplits TO DISK = 'c:\sqlback\TestSplits.bak' WITH INIT

TRUNCATE TABLE dbo.testsplits

CHECKPOINT

BACKUP LOG TestSplits TO DISK = 'c:\sqlback\TestSplits.trn' WITH INIT

INSERT INTO dbo.testsplits VALUES (1,'testing')
INSERT INTO dbo.testsplits VALUES (2,'testing')
INSERT INTO dbo.testsplits VALUES (3,'testing')
INSERT INTO dbo.testsplits VALUES (4,'testing')
INSERT INTO dbo.testsplits VALUES (5,'testing')
INSERT INTO dbo.testsplits VALUES (6,'testing')
INSERT INTO dbo.testsplits VALUES (7,'testing')
INSERT INTO dbo.testsplits VALUES (8,'testing')
INSERT INTO dbo.testsplits VALUES (9,'testing')
INSERT INTO dbo.testsplits VALUES (10,'testing')
-- INSERT INTO dbo.testsplits VALUES (11,'testing')
INSERT INTO dbo.testsplits VALUES (12,'testing')
INSERT INTO dbo.testsplits VALUES (13,'testing')
INSERT INTO dbo.testsplits VALUES (14,'testing')
INSERT INTO dbo.testsplits VALUES (15,'testing')
INSERT INTO dbo.testsplits VALUES (16,'testing')

INSERT INTO dbo.testsplits VALUES (11,'testing')

CHECKPOINT

You can use the sys.fn_dblog function to query the log for inserts and page splits for the testsplits table:


SELECT
[Current LSN],[Operation],[Context],[Description], [AllocUnitId],[AllocUnitName],
[Page ID],[Slot ID], [New Split Page],[Rows Deleted],[Bytes Freed]
FROM sys.fn_dblog(null,null)
WHERE [Operation] IN ('LOP_INSERT_ROWS','LOP_DELETE_SPLIT')
AND [Context] = 'LCX_CLUSTERED'
AND [AllocUnitName] = 'dbo.testsplits.pk_testsplits'
ORDER BY [Current LSN]

[Screenshot: testsplits1]

Back up the transaction log and you can then query the backup for the same information using the sys.fn_dump_dblog function:


BACKUP LOG TestSplits TO DISK = 'c:\sqlback\TestSplits.trn' WITH INIT

DECLARE @allocunitid bigint

SELECT @allocunitid = allocation_unit_id
FROM sys.allocation_units au
INNER JOIN sys.partitions p
ON au.container_id =
(
CASE WHEN au.type IN (1,3) THEN p.hobt_id
WHEN au.type = 2 THEN p.partition_id
END
)
WHERE p.object_id = OBJECT_ID('testsplits')
AND p.index_id = 1

SELECT
[Current LSN],[Operation],[Context],[Description], [AllocUnitId],[AllocUnitName],
[Page ID],[Slot ID], [New Split Page],[Rows Deleted],[Bytes Freed]
FROM sys.fn_dump_dblog(default,default,default,default,
'c:\sqlback\TestSplits.trn',default,default,default,default,default,default,default,
default,default,default,default,default,default,default,default,
default,default,default,default,default,default,default,default,
default,default,default,default,default,default,default,default,
default,default,default,default,default,default,default,default,
default,default,default,default,default,default,default,default,
default,default,default,default,default,default,default,default,
default,default,default,default,default,default,default,default
)
WHERE [Operation] IN ('LOP_INSERT_ROWS','LOP_DELETE_SPLIT')
AND [Context] = 'LCX_CLUSTERED'
AND [AllocUnitId] = @allocunitid
ORDER BY [Current LSN]

[Screenshot: testsplits2]

You can see that the allocation unit name is not persisted in the transaction log backup so we have to use the allocation unit id to search for the specific clustered index.

You don’t have to run this command on your production server. You can copy the transaction log backups to another SQL Server and query them there.

This is just one example of mining transaction log backups for useful information.
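
As a variation, the same filtering can be turned into a simple aggregate, for example to see how many page splits each index generated (a sketch shown against sys.fn_dblog for brevity; the identical WHERE clause works against sys.fn_dump_dblog, grouping on [AllocUnitId] only since the name is not persisted in a backup):

-- Count page splits per allocation unit in the current log
SELECT [AllocUnitId], [AllocUnitName], COUNT(*) AS page_splits
FROM sys.fn_dblog(NULL, NULL)
WHERE [Operation] = 'LOP_DELETE_SPLIT'
GROUP BY [AllocUnitId], [AllocUnitName]
ORDER BY page_splits DESC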
