Load Testing Your Storage Subsystem with Diskspd – Part III
In the final post of our “Load Testing Your Storage Subsystem with Diskspd” series, we're going to run some tests, look at the output from Diskspd, and interpret the results. In the first post we showed how performance can vary based on access pattern and I/O size. In the second post we showed how to design tests that highlight those performance characteristics, and in this post we'll execute those tests and review the results. First, let's walk through the output from Diskspd; for now, don't focus on the actual results.
There are four major sections:
- Test Parameters – the test's parameters, including the exact command line executed. This is great for reproducing tests.
```
Command Line: diskspd.exe -d15 -o1 -F1 -b60K -h -s -L -w100 C:\TEST\iotest.dat

Input parameters:

    timespan:   1
    -------------
    duration: 15s
    warm up time: 5s
    cool down time: 0s
    measuring latency
    random seed: 0
    path: 'C:\TEST\iotest.dat'
        think time: 0ms
        burst size: 0
        software and hardware write cache disabled
        performing write test
        block size: 61440
        number of outstanding I/O operations: 1
        thread stride size: 0
        IO priority: normal
```
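Since the output echoes the exact command line, it's worth archiving the full text of each run alongside a timestamp. Here's a minimal sketch of one way to do that; the `run_diskspd` wrapper and the results directory layout are my own invention, and it assumes diskspd.exe is on the PATH:

```python
# Hypothetical wrapper: run a Diskspd test and archive its full text
# output, so the "Command Line" header can be replayed later.
import datetime
import pathlib
import subprocess

def run_diskspd(args, results_dir="results"):
    """Run diskspd.exe with the given argument list and save its output."""
    out = subprocess.run(["diskspd.exe"] + args,
                         capture_output=True, text=True, check=True).stdout
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    path = pathlib.Path(results_dir)
    path.mkdir(exist_ok=True)
    (path / f"diskspd-{stamp}.txt").write_text(out)
    return out

# The command line from the output above, split into an argument list.
output = run_diskspd(["-d15", "-o1", "-F1", "-b60K", "-h", "-s", "-L",
                      "-w100", r"C:\TEST\iotest.dat"])
```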
- CPU Usage – CPU usage for the test. Recall that if you are not consuming all of your bandwidth, you may want to add threads, and if your CPU burn is high, you may want to back off on the number of threads.
```
Results for timespan 1:
*******************************************************************************

actual test time:  15.00s
thread count:      1
proc count:        2

CPU |  Usage |  User  | Kernel |   Idle
-------------------------------------------
   0|  30.10%|   1.04%|  29.06%|  69.89%
   1|   0.10%|   0.10%|   0.00%|  99.78%
-------------------------------------------
avg.|  15.10%|   0.57%|  14.53%|  84.84%
```
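If you want to automate that tuning, here's a minimal sketch that sweeps the thread count and prints the summary line for each run. It assumes diskspd.exe is on the PATH and that the first `total:` line in the output is the Total IO summary row; the fixed flags are only examples, not a recommendation:

```python
# Sketch: sweep the thread count (-t) and compare throughput per run.
import re
import subprocess

for threads in (1, 2, 4, 8):
    cmd = ["diskspd.exe", "-d15", f"-t{threads}", "-o8", "-b64K",
           "-h", "-s", "-L", "-w0", r"C:\TEST\iotest.dat"]
    out = subprocess.run(cmd, capture_output=True, text=True,
                         check=True).stdout
    # The first "total:" line in Diskspd's output is the Total IO row.
    total = re.search(r"^total:\s*(.+)$", out, re.MULTILINE).group(1)
    print(f"{threads} thread(s): {total}")
```

Watch the CPU table alongside these numbers: the sweet spot is the thread count where throughput stops climbing before CPU burn gets out of hand.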
- Performance – this is the meat of the test. Here we see bandwidth measured in MB/sec and latency measured down to the microsecond. With SSDs and today's super-fast storage subsystems, you'll likely need that level of accuracy. This alone beats SQLIO in my opinion. I'm not much of a fan of IOPS, since those numbers are meaningless unless you also know the size of the I/Os behind them; check out Jeremiah Peschka's article on this here. Remember, focus on minimizing latency and maximizing throughput; refer back to the Part I and Part II posts in this series for details.
```
Total IO
thread |       bytes     |     I/Os     |     MB/s |  I/O per s |  AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |      3162378240 |        51471 |   201.04 |    3431.10 |   0.289 |     2.816 | C:\TEST\iotest.dat (20GB)
-----------------------------------------------------------------------------------------------------
total:        3162378240 |        51471 |   201.04 |    3431.10 |   0.289 |     2.816

Read IO
thread |       bytes     |     I/Os     |     MB/s |  I/O per s |  AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |               0 |            0 |     0.00 |       0.00 |   0.000 |       N/A | C:\TEST\iotest.dat (20GB)
-----------------------------------------------------------------------------------------------------
total:                 0 |            0 |     0.00 |       0.00 |   0.000 |       N/A

Write IO
thread |       bytes     |     I/Os     |     MB/s |  I/O per s |  AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |      3162378240 |        51471 |   201.04 |    3431.10 |   0.289 |     2.816 | C:\TEST\iotest.dat (20GB)
-----------------------------------------------------------------------------------------------------
total:        3162378240 |        51471 |   201.04 |    3431.10 |   0.289 |     2.816
```
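The point about IOPS is easy to verify from the table above: multiply the I/O rate by the I/O size and you get the bandwidth column back. A quick sanity check (my arithmetic, using the Total IO row):

```python
# IOPS only means something alongside an I/O size.
# Numbers below are taken from the Total IO row above.
block_size = 61440            # bytes per I/O (-b60K)
iops = 3431.10                # "I/O per s" column
mb_per_s = iops * block_size / (1024 * 1024)
print(f"{mb_per_s:.2f} MB/s")  # ~201.04, matching the MB/s column
```

Double the block size at the same IOPS figure and you'd have double the bandwidth, which is why an IOPS number quoted without an I/O size tells you very little.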
- Histogram – this gives a great representation of how your test did over the whole run. In this example, 99% of our I/Os completed in under 0.654 ms…that's pretty super.
```
  %-ile |  Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
    min |        N/A |      0.059 |      0.059
   25th |        N/A |      0.163 |      0.163
   50th |        N/A |      0.193 |      0.193
   75th |        N/A |      0.218 |      0.218
   90th |        N/A |      0.258 |      0.258
   95th |        N/A |      0.312 |      0.312
   99th |        N/A |      0.654 |      0.654
3-nines |        N/A |     17.926 |     17.926
4-nines |        N/A |     18.906 |     18.906
5-nines |        N/A |    583.568 |    583.568
6-nines |        N/A |    583.568 |    583.568
7-nines |        N/A |    583.568 |    583.568
8-nines |        N/A |    583.568 |    583.568
    max |        N/A |    583.568 |    583.568
```
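If the "n-nines" rows look unfamiliar, they are just high percentiles: 3-nines is the 99.9th percentile, 4-nines the 99.99th, and so on. A tiny sketch with made-up data (the latencies below are synthetic, not from Diskspd) shows how a handful of slow I/Os can blow out the extreme tail while the median stays tiny, exactly the shape of the table above:

```python
# Synthetic latencies: mostly ~0.2 ms with a few 500+ ms outliers.
import random

random.seed(0)
lat = [random.uniform(0.1, 0.3) for _ in range(100_000)]
lat += [random.uniform(500, 600) for _ in range(5)]
lat.sort()

for p in (50, 99, 99.9, 99.999):
    idx = min(int(len(lat) * p / 100), len(lat) - 1)
    print(f"{p}th percentile: {lat[idx]:.3f} ms")
```

The median and 99th percentile stay well under a millisecond, but the 99.999th percentile lands on the outliers, which is why the high-nines rows are where storage problems hide.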
Impact of I/O Access Patterns

- Random - diskspd.exe -d15 -o32 -t2 -b64K -h -r -L -w0 C:\TEST\iotest.dat
```
Read IO
thread |       bytes     |     I/Os     |     MB/s |  I/O per s |  AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |     16066543616 |       245156 |  1021.49 |   16343.84 |   1.896 |     0.286 | C:\TEST\iotest.dat (20GB)
     1 |     16231759872 |       247677 |  1031.99 |   16511.91 |   1.877 |     0.207 | C:\TEST\iotest.dat (20GB)
-----------------------------------------------------------------------------------------------------
total:       32298303488 |       492833 |  2053.48 |   32855.75 |   1.886 |     0.250
```

In this test you can see that there is high throughput and very low latency. This disk is a PCIe-attached SSD, so it performs well with a random I/O access pattern.
- Sequential - diskspd.exe -d15 -o32 -t2 -b64K -h -s -L -w0 C:\TEST\iotest.dat
```
Read IO
thread |       bytes     |     I/Os     |     MB/s |  I/O per s |  AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |     16094724096 |       245586 |  1022.21 |   16355.35 |   1.895 |     0.260 | C:\TEST\iotest.dat (20GB)
     1 |     16263544832 |       248162 |  1032.93 |   16526.91 |   1.875 |     0.185 | C:\TEST\iotest.dat (20GB)
-----------------------------------------------------------------------------------------------------
total:       32358268928 |       493748 |  2055.14 |   32882.26 |   1.885 |     0.225
```

In this test you can see that the sequential I/O pattern yields a performance profile very similar to the random I/O test on the SSD. Recall that an SSD does not have to move a disk head or rotate a platter, so access to any location on the drive carries the same latency cost.
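To make this comparison repeatable, you can run the two commands back to back and compare only the summary lines. A rough sketch, under the same assumptions as before (diskspd.exe on the PATH, and the first `total:` line being the Total IO row):

```python
# Sketch: run the same read test with random (-r) and sequential (-s)
# access and print the Total IO summary lines side by side.
import re
import subprocess

base = ["diskspd.exe", "-d15", "-o32", "-t2", "-b64K", "-h", "-L", "-w0",
        r"C:\TEST\iotest.dat"]

for flag, label in (("-r", "random"), ("-s", "sequential")):
    out = subprocess.run(base[:1] + [flag] + base[1:],
                         capture_output=True, text=True, check=True).stdout
    total = re.search(r"^total:\s*(.+)$", out, re.MULTILINE).group(1)
    print(f"{label:>10}: {total}")
```

On a spinning disk, the same pair of runs would show a dramatic gap between the two lines; on this SSD they are nearly identical.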
Impact of I/O Sizes
- Transaction log simulation - diskspd.exe -d15 -o1 -t1 -b60K -h -s -L -w100 C:\TEST\iotest.dat
```
Write IO
thread |       bytes     |     I/Os     |     MB/s |  I/O per s |  AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |      3162378240 |        51471 |   201.04 |    3431.10 |   0.289 |     2.816 | C:\TEST\iotest.dat (20GB)
-----------------------------------------------------------------------------------------------------
total:        3162378240 |        51471 |   201.04 |    3431.10 |   0.289 |     2.816
```

This test measures the access latency of a single thread issuing very small transfers. As you can see, latency is very low at 0.289 ms, which is expected on a low-latency device such as a locally attached SSD.

- Backup operation simulation - diskspd.exe -d15 -o32 -t4 -b512K -h -s -L -w0 C:\TEST\iotest.dat
```
Read IO
thread |       bytes     |     I/Os     |     MB/s |  I/O per s |  AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
     0 |      8552185856 |        16312 |   543.17 |    1086.33 |  29.434 |    26.063 | C:\TEST\iotest.dat (20GB)
     1 |      8846311424 |        16873 |   561.85 |    1123.69 |  28.501 |    25.373 | C:\TEST\iotest.dat (20GB)
     2 |      8771338240 |        16730 |   557.09 |    1114.17 |  28.777 |    25.582 | C:\TEST\iotest.dat (20GB)
     3 |      8876720128 |        16931 |   563.78 |    1127.56 |  28.440 |    25.353 | C:\TEST\iotest.dat (20GB)
-----------------------------------------------------------------------------------------------------
total:       35046555648 |        66846 |  2225.88 |    4451.76 |  28.783 |    25.593
```

This final test simulates reading data for a backup. The larger I/Os have a higher latency but also yield a higher transfer rate, at 2,225 MB/sec.
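As a final sanity check (the arithmetic below is mine, not something Diskspd reports), these columns hang together via Little's Law: throughput is roughly the number of outstanding I/Os divided by the average latency. The same relationship explains the transaction log test above, where a single outstanding 60K I/O at 0.289 ms works out to roughly the 201 MB/sec measured.

```python
# Little's Law sanity check against the backup-simulation table above:
# throughput (IOPS) ~= outstanding I/Os / average latency
outstanding = 4 * 32             # -t4 threads x -o32 outstanding I/Os each
avg_latency_s = 28.783 / 1000    # AvgLat column, converted to seconds
block_size = 512 * 1024          # -b512K, in bytes

iops = outstanding / avg_latency_s
mb_per_s = iops * block_size / (1024 * 1024)
print(f"~{iops:.0f} IOPS")       # ~4447, versus 4451.76 measured
print(f"~{mb_per_s:.0f} MB/s")   # ~2224, versus 2225.88 measured
```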
In this series of posts we introduced some theory on how drives access data, presented tests for exploring the performance profile of your disk subsystem, and reviewed the Diskspd output for those tests. This should give you the tools and ideas you need to load test your disk subsystem and ensure your SQL Servers will perform well when you put them into production!