Archive for July, 2014

Delayed durability data loss demonstration

This MSDN article, http://msdn.microsoft.com/en-gb/library/dn449490.aspx, makes it clear under the section ‘When can I lose data? SQL Server shutdown and restart’ that a normal shutdown and restart can result in missing data. That is, client applications may receive acknowledgement that a transaction has been committed, but because of the deferred log writer activity the log records for that transaction may not have been persisted to the transaction log file. The following scripts demonstrate the data loss that you could encounter.

First I’ll show that a database with delayed durability disabled does not suffer data loss. My testing database is named DelayedDurabilityTest. I will be inserting data into a table named MyData:

CREATE TABLE DelayedDurabilityTest.dbo.MyData
(
id BIGINT NOT NULL IDENTITY,
val VARCHAR(255) NOT NULL
)
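
For completeness, the test database itself can be created with the server defaults. The statement below is just a minimal sketch; file locations and sizes are left to the defaults and can be adjusted as required (delayed durability requires SQL Server 2014 or later):

-- Minimal sketch: create the test database using the server defaults
-- (adjust file locations/sizes as required for your environment)
CREATE DATABASE DelayedDurabilityTest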

For the first test I set delayed durability to disabled:

ALTER DATABASE [DelayedDurabilityTest] SET DELAYED_DURABILITY = DISABLED WITH NO_WAIT

Now in one query window run this to start populating the table:

-- Insert script - best to run this using SQLCMD - see the very end of this post
SET NOCOUNT ON
WHILE 1=1 
BEGIN
INSERT INTO DelayedDurabilityTest.dbo.MyData (val) VALUES ('Hello !!')
PRINT SCOPE_IDENTITY()
END

While this is running start a new query window and run this:

SHUTDOWN WITH NOWAIT

Switch back to the window that was running the inserts and note down the last printed identity value. For my example, this was

...16248
16249
16250
16251
16252
Msg 0, Level 11, State 0, Line 0
A severe error occurred on the current command.  The results, if any, should be discarded.
Msg 0, Level 20, State 0, Line 0
A severe error occurred on the current command.  The results, if any, should be discarded.

So in my case the last printed identity value was 16252.

Now restart the SQL Server and run

SELECT COUNT(*) FROM DelayedDurabilityTest.dbo.MyData

In my case this returned a value of 16253. The fact that it is one higher than the last printed value in the window that was doing the inserts simply means the printing of output was lagging behind the actual inserts. The important point is that there is no data loss.
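
As an optional sanity check, the highest persisted identity value should agree with the row count, since the identity starts at 1 and no inserts were rolled back to leave gaps. A small hedged extra, not required for the test itself:

-- Optional check: the maximum identity value should match the row count
SELECT MAX(id) AS max_id, COUNT(*) AS total_rows
FROM DelayedDurabilityTest.dbo.MyData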

Now we truncate the table, set delayed durability to FORCED and rerun the inserting window and then the shutdown window:


TRUNCATE TABLE DelayedDurabilityTest.dbo.MyData
ALTER DATABASE [DelayedDurabilityTest] SET DELAYED_DURABILITY = FORCED WITH NO_WAIT

This time the output of the inserting window was this:


18554
18555
18556
18557
18558
18559
Msg 109, Level 20, State 0, Line 0
A transport-level error has occurred when receiving results from the server. (provider: Shared Memory Provider, error: 0 - The pipe has been ended.)

So the last printed identity value was 18559. As before, there may well be more rows than this in the table, since the printed output can lag the actual inserts and under-represent the true figure. However, when SQL Server was restarted the row count in the table was 18512, which is lower than the last printed identity value: at least 47 acknowledged inserts are missing. So this has shown that with delayed durability forced a SQL Server shutdown has resulted in data loss.

When running the scripts it may take several attempts to produce a case where the row count is lower than the last printed identity value. Running the insert script through SQLCMD, as below, tends to give more occurrences where the last printed identity is higher than the actual row count in the table when SQL Server is restarted.


SQLCMD -S localhost -i insertscript.sql


Time spent in app, time spent in SQL Server

Application teams will sometimes believe they have a database issue when a recurring feed starts to perform poorly. Typically they will call on the DBA to perform a trace to identify the poorest performing SQL calls. When the trace is run and the analysis completed you may find that there are no obvious SQL issues. All the statements perform well. The performance problem may not necessarily reside on the database tier. It may reside on the application tier. It may be that the application tier is doing a large amount of number crunching or string processing between the recurring SQL calls. It may be the case that these routines have suddenly become the slow step in the feed.

Having performed the SQL trace it would be great if, in addition to the duration of the SQL calls, we could determine the elapsed time between SQL calls. One method for obtaining this figure is to subtract the end time of the previous SQL call from the start time of the current one. If you export the trace data to a table, filter on, for example, the application name, SPID, hostname or proc name that is doing the load, and order by start time, you can use the LAG function to obtain the elapsed time between SQL calls.
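
If the trace was captured as a server-side trace file, one way to get it into a table for this analysis is sys.fn_trace_gettable. The following is only a sketch: the file path, the dbo.tracetbl table name and the event class filter are assumptions to be adapted to your own trace definition:

-- Sketch: load a server-side trace file into a table for analysis
-- (the file path is an assumption; DEFAULT reads any rollover files too)
SELECT *
INTO dbo.tracetbl
FROM sys.fn_trace_gettable('C:\Traces\myfeedtrace.trc', DEFAULT)
WHERE EventClass IN (10, 12)  -- RPC:Completed and SQL:BatchCompleted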

In the following trace code the procedure that is performing the load is named myproc. The last two columns in the final result set show the total time spent doing SQL work and the total time spent in the application between the SQL calls.


WITH ctetracetimes AS
(
SELECT
starttime,
endtime,
LAG(endtime) OVER (ORDER BY starttime) AS previousendtime
FROM dbo.tracetbl
WHERE objectname = 'myproc'
),
cteappandsqltimes AS
(
SELECT
starttime,
endtime,
previousendtime,
DATEDIFF(millisecond,starttime,endtime) AS timespentsql,
DATEDIFF(millisecond,previousendtime,starttime) AS timespentapp
FROM ctetracetimes
)
SELECT *,
SUM(timespentsql) OVER() AS sumtimespentsql,
SUM(timespentapp) OVER() AS sumtimespentapp
FROM cteappandsqltimes
ORDER BY starttime

The results for repeated execution of the myproc stored procedure were as follows:

[Results grid: starttime, endtime, previousendtime, timespentsql and timespentapp for each of the ten calls, with the sumtimespentsql and sumtimespentapp totals in the last two columns]

You can see that 50,000 milliseconds was spent doing the SQL-related work (processing the stored procedure ten times) but over 150,000 milliseconds was spent in the application tier in between the SQL calls. In other words, roughly three quarters of the elapsed time was spent outside SQL Server.

This type of analysis may be quite enlightening even for what you consider to be your well-performing feeds. You may be making the assumption that the vast majority of your feed time is associated with database processing when, in fact, it is associated with application processing time. For example, one of my own feeds had a total elapsed time of twenty minutes. When the above analysis was performed the SQL time was only two minutes; eighteen minutes was being spent in the application tier. A poorly performing number-crunching routine was identified and optimised, reducing the overall run-time of the feed.
