Archive for category Benchmarking

OSTRESS doesn’t push SQL Server as hard as you think

Many folks recommend using the RML Utilities OSTRESS.EXE as a benchmarking/stress testing tool. The problem is there seems to be an internal bottleneck limiting the tool to make only about 500 SQL calls per second per thread. The following illustrates the problem.

Create this simple no-op stored procedure in your testing database


CREATE PROCEDURE dbo.noopproc
AS
BEGIN
RETURN
END

Now use OSTRESS to start one thread calling the stored procedure 20,000 times

ostress -Syoursqlserver -dtestingdb -E -Q”dbo.noopproc” -n1 -r20000

Output:

07/29/18 12:11:50.276 [0x00001ED0] I/O Completion manager started
07/29/18 12:11:50.277 [0x00001558] Attempting DOD5015 removal of [C:\Users\xxx\AppData\Local\Temp\output\ostress.log]
07/29/18 12:11:50.280 [0x00001558] OSTRESS, A Generic ODBC-Based Stress/Replay Utility.
Version 9.04.0051 built for x64.
Copyright ⌐ 1997-2014 Microsoft. All Rights Reserved
07/29/18 12:11:50.281 [0x00001558] Computer: xxx
07/29/18 12:11:50.282 [0x00001558] Base Module: C:\Program Files\Microsoft Corporation\RMLUtils\ostress.exe
07/29/18 12:11:50.282 [0x00001558] Process Id: 1400
07/29/18 12:11:50.283 [0x00001558] Active proc mask(0): 0x000000FF
07/29/18 12:11:50.283 [0x00001558] Architecture: 9
07/29/18 12:11:50.284 [0x00001558] Page size: 4096
07/29/18 12:11:50.284 [0x00001558] CPUs: 8
07/29/18 12:11:50.285 [0x00001558] Processor groups: 1
07/29/18 12:11:50.285 [0x00001558] Highest node: 0
07/29/18 12:11:50.286 [0x00001558] Proximity: 00 Node: 00
07/29/18 12:11:50.287 [0x00001558] —————————————
07/29/18 12:11:50.288 [0x00001558] Group: 0
07/29/18 12:11:50.288 [0x00001558] —————————————
07/29/18 12:11:50.288 [0x00001558] Processor(s): 0x00000003 Function units: Shared
07/29/18 12:11:50.289 [0x00001558] Processor(s): 0x0000000C Function units: Shared
07/29/18 12:11:50.290 [0x00001558] Processor(s): 0x00000030 Function units: Shared
07/29/18 12:11:50.290 [0x00001558] Package mask: 0x000000FF
07/29/18 12:11:50.291 [0x00001558] Processor(s): 0x000000C0 Function units: Shared
07/29/18 12:11:50.291 [0x00001558] Processor(s): 0x000000FF assigned to Numa node: 0
07/29/18 12:11:50.292 [0x00001558] Current time bias: -60 minutes -1.00 hours DST Adjusted GMT Daylight Time
07/29/18 12:11:50.292 [0x00001558] WARNING: -o parameter was not supplied; using default output path of <temporary directory>\output [C:\Users\xxx\AppData\Local\Temp\output\]
07/29/18 12:11:50.293 [0x00001558] Max threads setting: 10000
07/29/18 12:11:50.293 [0x00001558] Arguments:
07/29/18 12:11:50.294 [0x00001558] -Sxxx
07/29/18 12:11:50.294 [0x00001558] -dtestingdb
07/29/18 12:11:50.295 [0x00001558] -E
07/29/18 12:11:50.295 [0x00001558] -Qdbo.noopproc
07/29/18 12:11:50.295 [0x00001558] -n1
07/29/18 12:11:50.296 [0x00001558] -r20000
07/29/18 12:11:50.298 [0x00001558] Using language id (LCID): 1024 [English_United States.1252] for character formatting with NLS: 0x0006020F and Defined: 0x0006020F
07/29/18 12:11:50.299 [0x00001558] Default driver: SQL Server Native Client 11.0
07/29/18 12:11:50.299 [0x00001558] Attempting DOD5015 removal of [C:\Users\xxx\AppData\Local\Temp\output\query.out]
07/29/18 12:11:50.302 [0x00001558] Starting query execution…
07/29/18 12:11:50.304 [0x00001558] BETA: Custom CLR Expression support enabled.
07/29/18 12:11:50.305 [0x00001558] Creating 1 thread(s) to process queries
07/29/18 12:11:50.305 [0x00001558] Worker threads created, beginning execution…
07/29/18 12:12:28.335 [0x00001558] Total IO waits: 0, Total IO wait time: 0 (ms)
07/29/18 12:12:28.336 [0x00001558] OSTRESS exiting normally, elapsed time: 00:00:38.037

So 38 seconds to call the no-op stored procedure 20,000 times from one thread using OSTRESS .EXE. That’s not very fast.

Compare to a simple Powershell script making the same 20,000 calls.


$connstring = 'Server=yoursqlserver;Database=testingdb;Trusted_Connection=True'
$connection = New-Object System.Data.SqlClient.SqlConnection $connstring
$connection.Open()
$command = New-Object System.Data.SqlClient.SqlCommand('dbo.noopproc', $connection)
$command.CommandType = [System.Data.CommandType]'StoredProcedure'
$starttime = Get-Date

(1..20000) | ForEach-Object { $ret = $command.ExecuteNonQuery() }

Write-Host "Duration: " (New-TimeSpan -Start $starttime -End (Get-Date)).TotalSeconds

$connection.Close() | Out-Null
$command.Dispose() | Out-Null
$connection.Dispose() | Out-Null

Output:

Duration: 1.570418

So 1.57 seconds to make the same make the same 20,000 stored procedure calls.

Performance monitor showing the difference in batch requests per second

 

This was tested against a SQL 2017 Developer Edition instance patched to Cumulative Update 9.

The recommendation is to not use OSTRESS as your benchmarking/stress testing tool.

 

Advertisements

Leave a comment

SQLIO write buffer type

When doing a search for SQLIO you will find many blog posts showing example executions with the -BH parameter. According to the help for SQLIO the possible values for the -B parameter are H for hardware buffering, S for software buffering, Y for hardware and software buffering and N for no buffering. The use of -BH for the SQLIO read examples may be fine but I would suggest the use of -BH for the write examples is incorrect if you are trying to match SQL Server’s write access method. Using Process Monitor we can identity the write buffering method that SQL Server uses when, for example, we do simple inserts into a table:

sqlio1

You can see the sequential writes to the transaction log made by the log writer process after each commit and the random writes to the data file made by the checkpoint process. In both cases the write type is a Non-cached, Write Through write.

Now checking Process Monitor when the following SQLIO command is run using the -BH parameter

sqlio -kW -s5 -dD -o8 -fsequential -b8 -BH -LS -Fparam.txt

sqlio v1.5.SG
using system counter for latency timings, 2597685 counts per second
parameter file used: param.txt
file d:\testfile.dat with 4 threads (0-3) using mask 0x0 (0)
4 threads writing for 5 secs to file d:\testfile.dat
using 8KB sequential IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
using specified size: 100 MB for file: d:\testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 28807.29
MBs/sec:   225.05
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 1
Max_Latency(ms): 3
histogram:
ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%:  0 99  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

we see this:

sqlio2

Note the missing Write-Through.

Specifying -BN in place of -BH we match the write type that SQL Server is actually using:

sqlio -kW -s5 -dD -o8 -fsequential -b8 -BN -LS -Fparam.txt

sqlio v1.5.SG
using system counter for latency timings, 2597685 counts per second
parameter file used: param.txt
file d:\testfile.dat with 4 threads (0-3) using mask 0x0 (0)
4 threads writing for 5 secs to file d:\testfile.dat
using 8KB sequential IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to not use file nor disk caches (as is SQL Server)
using specified size: 100 MB for file: d:\testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 28794.08
MBs/sec:   224.95
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 0
Max_Latency(ms): 3
histogram:
ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%:  1 99  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

sqlio3

The difference in results for the two parameter types may be quite large if you are doing benchmarks against different types of storage infrastructure. For example, replicated SAN storage.

2 Comments