1. Overview

In this tutorial, we’ll learn about the key metrics to consider when evaluating disk performance. Additionally, we’ll look at some command-line tools in Linux that allow us to benchmark a disk device.

2. Disk Performance

Testing the performance of a disk is crucial for understanding its capabilities. Furthermore, these insights can help us identify resource bottlenecks in our system and thereby allow us to optimize our workloads more effectively.

When it comes to the performance of a disk, there are several key metrics to consider. Firstly, the sequential read speed measures how fast data can be read from the disk in a sequential manner. Concretely, this metric measures how fast a workload can read a large file that’s stored in a contiguous block of disk space. Similarly, the sequential write speed measures the rate at which data can be written to the disk.

Then, there’s the random access performance, which measures how fast the disk can perform small, random read/write requests. A random read means that the read requests do not target data in the same block, resulting in more seek overhead for spinning hard disk drives. Similarly, a random write reduces throughput, as the drive has to write the data onto different regions of the disk. These metrics are distinct from the sequential read and write speeds because disks tend to perform worse on random access than on sequential access.

In the following sections, we’ll look at several methods for testing the different disk metrics we’ve discussed. We’ll run the tests on the /tmp directory, as it’s mounted on the main disk device, a solid-state drive (SSD). If we’d like to test another device’s performance, we can mount a directory onto that device and run the tests on that directory instead.
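
For example, assuming a second disk exposed as /dev/sdb1 (a hypothetical device name), we could mount it onto a dedicated directory and point the benchmarks there:

$ sudo mkdir -p /mnt/benchtest
$ sudo mount /dev/sdb1 /mnt/benchtest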

3. dd

The dd utility in Linux offers functionality to measure both read and write speed. The advantage of using the dd command for benchmarking is that it’s straightforward to use and produces easy-to-interpret output. The downside, however, is that it can only measure the sequential read and write speeds and lacks support for random access metrics.

The dd utility is part of the coreutils package and hence should be available by default in most Linux distros.
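
On GNU systems, we can confirm that dd is present and see which coreutils release it belongs to by checking its version:

$ dd --version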

3.1. Testing Sequential Write Speed

To test the write speed, we can run the dd command with the input file set to /dev/zero. Then, we write the stream of zeros from /dev/zero to /tmp/tempfile:

$ dd if=/dev/zero of=/tmp/tempfile bs=1M count=1024 conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.42868 s, 442 MB/s

For the test, we’ve set the block size to 1 megabyte, and we’ll be writing 1024 such blocks. Additionally, we pass the conv=fdatasync option to the command to ensure the data is physically written to the disk, and not just sitting in the buffer cache, when dd reports completion.
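
As an alternative worth knowing, GNU dd also supports oflag=direct, which bypasses the page cache entirely via direct I/O instead of flushing it at the end. This is a minimal sketch, assuming the underlying filesystem supports direct I/O (tmpfs, for instance, does not):

$ dd if=/dev/zero of=/tmp/tempfile bs=1M count=1024 oflag=direct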

3.2. Testing Sequential Read Speed

Using the same file we’ve written while testing the write speed, we can also test the sequential read speed. However, we’ll need to first clear the buffer cache:

$ sudo sh -c "/usr/bin/echo 3 > /proc/sys/vm/drop_caches"

Clearing the buffer cache ensures that the read test we’ll be conducting later reads the file from the disk, not from the cache.
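
Note that drop_caches doesn’t write back dirty pages, so it’s good practice to run sync first to flush any pending writes before dropping the caches:

$ sudo sh -c "sync; /usr/bin/echo 3 > /proc/sys/vm/drop_caches"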

Once we’ve cleared the buffer cache, we can run a read speed test using the dd command again. For the sequential read speed test, we’ll read from /tmp/tempfile and write to the /dev/null pseudo-device file:

$ dd if=/tmp/tempfile of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.68137 s, 639 MB/s

As a fun side experiment, we can now rerun the previous command and observe the speed of reading from the buffer cache:

$ dd if=/tmp/tempfile of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.146467 s, 7.3 GB/s

As we can see, the subsequent run of the same command reports a significantly faster read speed of 7.3 gigabytes per second. This is because, after the first run, the content of the file is cached in memory. Therefore, the read speed the dd command reports is actually the speed of reading from memory, not from the disk.
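
Once we’re done with the dd tests, we can remove the test file to reclaim the disk space:

$ rm /tmp/tempfile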

Despite being available by default and easy to use, the dd command’s lack of random read-write benchmarking makes it severely limited. Let’s look at another tool that we can use for testing the random read-write speed.

4. iozone

iozone is a tool that measures disk performance across a broad range of file operations. For example, with iozone, we can measure the performance of the disk on file operations like sequential read-write, random read-write, re-read and re-write, strided read, and many more. The iozone command selects between these different file operations using the -i option.
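
For instance, once the tool is installed, we could measure only the sequential read and re-read speeds by combining -i0, which writes the test file, with -i1, the read/re-read test. This is a minimal sketch following the same option pattern we’ll use later:

$ iozone -t1 -i0 -i1 -r1m -s1g /tmp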

4.1. Installation

To obtain the iozone binary, we can install the iozone3 package with our package manager:

$ sudo apt-get install -y iozone3
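
The package name can differ between distributions. On Red Hat-based systems, for example, the package is typically named iozone and may need to come from a third-party repository, an assumption worth verifying against the distribution’s own repositories:

$ sudo dnf install -y iozone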

Then, we can verify the installation by checking its version using the -v option:

$ iozone -v
       'Iozone' Filesystem Benchmark Program
 
        Version $Revision: 3.489 $
    Compiled for 64 bit mode.
...

4.2. Performance on Random Read and Random Write

To run random read-and-write tests, we can pass the -i2 option flag to the iozone command:

$ iozone -t1 -i0 -i2 -r1k -s1g /tmp

In addition to the -i2 option flag, we’ve also configured the test using several other options. Firstly, the -t1 option sets the number of threads for the test execution to one. Then, we specify -i0 so that iozone first creates the test file with a sequential write test. Furthermore, -r1k and -s1g configure the test to use a record size of 1 kilobyte and a file size of 1 gigabyte. Finally, we set the test path to /tmp.

$ iozone -t1 -i0 -i2 -r1k -s1g /tmp
    Iozone: Performance Test of File I/O
            ...

    Run began: Sat Aug 12 05:44:31 2023

    ...

    Children see throughput for 1 random readers     =  639369.00 kB/sec
    Parent sees throughput for 1 random readers     =  632538.79 kB/sec
    Min throughput per process             =  639369.00 kB/sec 
    Max throughput per process             =  639369.00 kB/sec
    Avg throughput per process             =  639369.00 kB/sec
    Min xfer                     = 1048576.00 kB

    Children see throughput for 1 random writers     =   19674.84 kB/sec
    Parent sees throughput for 1 random writers     =   15381.51 kB/sec
    Min throughput per process             =   19674.84 kB/sec 
    Max throughput per process             =   19674.84 kB/sec
    Avg throughput per process             =   19674.84 kB/sec
    Min xfer                     = 1048576.00 kB

4.3. Interpreting the Result

From the output, we get four different sections of results. Each section shows the details of one of the tests we’ve run. The first two sections show the sequential write and rewrite speeds. That’s because we’ve passed the -i0 option flag to first generate a test file for our random read-write experiment.

Then, we can see the random read speed in the section with the line “random readers”. From the throughput reading, we see that the random read speed on the disk is roughly 639 megabytes per second. That’s very close to the sequential read performance we measured using the dd command. The reason for this observation is that an SSD doesn’t suffer from the seek overhead a spinning HDD incurs on random reads.

The fourth section, with the line “random writers”, shows the random write speed on the disk. We can see that random write fares worse than sequential write on the same drive. This is because, on an SSD, every write incurs an entire page update, regardless of how few bytes we’re writing per request. Therefore, a random write will have lower throughput than a sequential write, even on an SSD.
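
If the per-request page-update overhead is indeed the dominant cost, a larger record size should narrow the gap. As a quick experiment under that assumption, we could rerun the test with a 1-megabyte record size and compare the random write throughput:

$ iozone -t1 -i0 -i2 -r1m -s1g /tmp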

Furthermore, in each of the result sections, we can also see min, max, and average throughput statistics. These statistics are only relevant when we run the test with multiple threads. Since we’ve used a single thread, they all show the same reading, as there’s only a single data point.
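
To make these statistics meaningful, we could rerun the benchmark with several threads. In iozone’s throughput mode, -s sets the file size per thread, so the sketch below runs four threads, each working on its own 256-megabyte file, keeping the total at 1 gigabyte:

$ iozone -t4 -i0 -i2 -r1k -s256m /tmp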

5. Conclusion

In this tutorial, we’ve learned about some key metrics of disk performance and how to measure them. Firstly, we learned that the dd command-line tool provides sequential read and write speed measurements in the form of copying files. Then, we looked at a more sophisticated disk benchmarking command-line tool, the iozone command. We learned that iozone can test the performance of a disk with a variety of different file operations, such as random read-write.