
SPDK Notes

Krijn Doekemeijer edited this page Aug 23, 2023 · 1 revision

Setting up SPDK

Setting up SPDK takes three steps:

  1. clone the SPDK repo
  2. check out a stable version
  3. follow https://spdk.io/doc/getting_started.html. It boils down to running, in order, a script that installs all dependencies, the configure script, and make.
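The three steps above can be sketched as a small shell function. This is only a sketch under assumptions: build_spdk, run and the DRY_RUN switch are hypothetical helpers, the clone URL is assumed to be the public SPDK repository on GitHub, and the version tag is left to the caller because these notes do not name one. The getting-started guide remains the authoritative reference.

```shell
# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# build_spdk TAG: clone SPDK into ./spdk, check out TAG, and build it.
# TAG is chosen by the caller; no particular release is assumed here.
build_spdk() {
    tag="$1"
    run git clone https://github.com/spdk/spdk &&
    run git -C spdk checkout "$tag" &&
    # SPDK pulls in DPDK (and others) as git submodules.
    run git -C spdk submodule update --init &&
    # Installs build dependencies through the system package manager.
    run sudo ./spdk/scripts/pkgdep.sh &&
    run sh -c 'cd spdk && ./configure' &&
    run make -C spdk -j "$(nproc)"
}
```

Calling it with DRY_RUN=1 set prints the commands without executing them, which is a cheap way to review the plan before touching the machine.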

Afterwards SPDK should work. You can test this by running one of the example programs that are built with SPDK, for example ./spdk/build/examples/nvme/identify (the path might have changed, but look for an identify binary). This should list all available NVMe devices and some general information, similar to nvme-cli.

There is one major challenge with using SPDK. The device you want to use must be driven from user space, which means the kernel driver must first be unbound from it. SPDK's setup script can do this for you, but by default it unbinds ALL NVMe devices. This is generally not what you want. Therefore, explicitly specify which devices you want to use. This is not done with names like "nvme0", but with the PCI address of the device (called the trid in these notes). Remember, we are moving away from Linux, and from its device naming as well. The trid of a device can be retrieved with:

trid=`ls -l /sys/block/$dev/device/device | awk '{split($11,dev,"/"); print dev[4]}'`

$dev should be the device name as it appears under /sys/block, for example nvme0n1. Then you can bind the device to SPDK with:

export PCI_ALLOWED=$trid
./spdk/scripts/setup.sh

The devices can be returned to the kernel driver with:

./spdk/scripts/setup.sh reset
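The lookup and bind commands above can be wrapped into two small functions. This is a sketch, not the method from these notes: pci_addr_of and bind_only are hypothetical names, the lookup uses readlink instead of parsing ls output, and it assumes the sysfs layout where /sys/block/<dev>/device/device resolves to the PCI device directory. The optional second argument exists only so the lookup can be pointed at a fake sysfs tree.

```shell
# pci_addr_of DEV [SYSFS_ROOT]: print the PCI address behind block device DEV.
# Assumes <root>/block/DEV/device/device resolves to the PCI device directory.
pci_addr_of() {
    dev="$1"
    sysfs="${2:-/sys}"
    basename "$(readlink -f "$sysfs/block/$dev/device/device")"
}

# bind_only DEV: hand only DEV to SPDK and leave all other NVMe devices alone.
# Assumes it is run from the directory that contains the spdk checkout.
bind_only() {
    trid="$(pci_addr_of "$1")" || return 1
    sudo env PCI_ALLOWED="$trid" ./spdk/scripts/setup.sh
}
```

A call such as bind_only nvme0n1 (the device name here is only an example) would then replace the manual export and setup.sh invocation.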

Setting up fio for SPDK

Fio is an essential tool for benchmarking I/O. Fio does not come with SPDK by default; it requires a plugin that is built together with SPDK itself. We have seen machines crash while using the plugin, so always run it within a VM. Installing fio with SPDK takes a number of steps:

  1. Be sure that SPDK itself is cloned recursively. There should be a DPDK directory within SPDK.
  2. Clone fio anywhere.
  3. cd into the fio directory and call ./configure
  4. call make -j $nprocs in the fio directory
  5. (optional) install globally. Only do this when you want to use this fio version anywhere. Do this with sudo make install.
  6. cd back to SPDK dir
  7. Call ./configure like usual, but now with an extra flag: --with-fio=<absolute path to the fio repo built earlier>. Do not use a relative path. This can break.
  8. Call make -j $nprocs.
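The build order above, including the absolute-path requirement, could be scripted roughly as follows. This is a sketch under assumptions: with_fio_flag and build_fio_plugin are hypothetical helper names, both repositories are assumed to be cloned already, and realpath and nproc are assumed to be available (they are part of GNU coreutils).

```shell
# with_fio_flag FIO_DIR: print the SPDK configure flag with FIO_DIR
# resolved to an absolute path, since relative paths can break the build.
with_fio_flag() {
    printf '%s%s\n' '--with-fio=' "$(realpath "$1")"
}

# build_fio_plugin SPDK_DIR FIO_DIR: build fio first, then rebuild SPDK against it.
build_fio_plugin() {
    spdk_dir="$1"
    fio_dir="$2"
    # Resolve the flag before changing directory, so a relative FIO_DIR
    # is interpreted relative to where the caller is standing.
    flag="$(with_fio_flag "$fio_dir")" || return 1
    (cd "$fio_dir" && ./configure && make -j "$(nproc)") &&
    (cd "$spdk_dir" && ./configure "$flag" && make -j "$(nproc)")
}
```

Resolving the path once up front keeps the SPDK configure step independent of the directory it happens to run in.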

Now verify that it works. Simply calling fio is not enough to use the plugin; you need to preload it. This can be done with:

sudo LD_LIBRARY_PATH=<SPDK_DIR>/build/lib LD_PRELOAD=<SPDK_DIR>/build/fio/spdk_nvme fio

The spdk engine should now show up among the listed I/O engines. Try passing --enghelp=spdk to fio.
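Typing the preload variables by hand gets tedious, so a small helper can print them. This is a sketch: spdk_fio_env is a hypothetical name, and the two paths simply mirror the command above. It passes the variables through env so they are set for the fio process that runs as root.

```shell
# spdk_fio_env SPDK_DIR: print the two environment assignments that make
# fio load the SPDK plugin. SPDK_DIR should be an absolute path without spaces.
spdk_fio_env() {
    printf 'LD_LIBRARY_PATH=%s/build/lib LD_PRELOAD=%s/build/fio/spdk_nvme\n' "$1" "$1"
}

# Usage (not executed here; the path is a placeholder):
#   sudo env $(spdk_fio_env /path/to/spdk) fio --enghelp=spdk
```

The command substitution is deliberately left unquoted in the usage line so that the two assignments become separate arguments to env; that is also why the SPDK path must not contain spaces.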

SPDK fio benchmarking

Running fio with SPDK takes some getting used to. It has some issues with linking and paths, and it occasionally segfaults. Just as during the install, put LD_LIBRARY_PATH=<SPDK_DIR>/build/lib LD_PRELOAD=<SPDK_DIR>/build/fio/spdk_nvme before fio and run as root. There are a few other things that you need to know:

  • Always explicitly set the storage engine with ioengine=spdk
  • With SPDK we refer to the device with a trid (e.g. 0000:00:04.0). This is not valid in fio and results in parsing errors. The trid needs to be translated to use "." instead of ":". If you store the trid in a variable "trid", you can convert it to the valid format with triddot=$(echo $trid | sed 's/\:/./g').
  • Format trid + namespace like filename=trtype=PCIe traddr=0000.00.04.0 ns=2. This works poorly on the command line because of the spaces in the filename. Try to use ".fio" files where possible to load the benchmark instead.
  • When using ZNS-like namespaces, you need to convert the namespace ID. Fio expects 1-indexed namespaces, not 0-indexed ones like other systems. So if you want to use namespace 0, set ns=1, and if you want to use namespace 1, set ns=2.
  • The SPDK plugin does not support fio jobs that run as separate processes; jobs must run as threads. Therefore it is absolutely required to set thread=1. There is no error handling to protect you.
  • Use ZNS with zonemode=zbd. Within ZNS there are a number of idiosyncrasies:
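    • To use appends for I/O, set append=1 as well. The exact option name may differ between SPDK versions (it may be listed as zone_append), so check the output of --enghelp=spdk.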
    • iodepth cannot be higher than 1 when NOT using appends. Only raise iodepth when using appends.
    • Preferably use initial_zone_reset=1. This resets the zones before doing the benchmarks.
    • Use max_open_zones=... to limit the number of zones that fio can have open. Never set it higher than the device's real maximum number of open zones.
    • Use offset_increment=...z to force striping. Each additional writer will then write to a different selection of zones.
  • Increase writers with numjobs.
  • Be careful with the scheduling options. Setting arbitration_burst higher than the device allows will freeze the system; the device is essentially stuck. Unless needed, do not use enable_wrr and arbitration_burst.
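The trid and namespace conversions from the list above can be combined into one helper that prints a ready-to-use filename value. This is a minimal sketch: fio_filename is a hypothetical name, and it assumes the namespace is given 0-indexed, as described above.

```shell
# fio_filename TRID NS: print the filename= value for the SPDK fio plugin.
# TRID is the colon-separated PCI address; NS is the 0-indexed namespace.
fio_filename() {
    # fio cannot parse ":" in the address, so swap it for ".".
    triddot="$(echo "$1" | sed 's/:/./g')"
    # The plugin counts namespaces from 1, so shift the index by one.
    echo "trtype=PCIe traddr=$triddot ns=$(( $2 + 1 ))"
}
```

For example, fio_filename 0000:00:04.0 1 prints trtype=PCIe traddr=0000.00.04.0 ns=2, which can be pasted after filename= in a .fio file.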

For some examples of working SPDK fio tests for ZNS, look at the SPDK directory in https://github.com/Krien/ZNS_SPDK_Benchmarks; the fio files in particular should be relevant. A simple write example could be:

# Inside of example.fio
[global]
ioengine=spdk
thread=1
group_reporting=1
direct=1
time_based=1
ramp_time=5
runtime=60
size=128z
rw=write
iodepth=1
zonemode=zbd
max_open_zones=14
initial_zone_reset=1
filename=trtype=PCIe traddr=0000.00.04.0 ns=2

[1z1t]
stonewall
numjobs=1

[2z2t]
stonewall
offset_increment=128z
numjobs=2

[3z3t]
stonewall
offset_increment=128z
numjobs=3

[4z4t]
stonewall
offset_increment=128z
numjobs=4

This test stripes data using writes on a ZNS SSD at various levels of concurrency. Notice how stonewall is used to separate the benchmarks and how we use ramp_time.
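The per-job sections in the example follow a regular pattern, so they could also be generated instead of written by hand. The sketch below assumes that pattern holds for any writer count; gen_jobs is a hypothetical helper, and its output still has to be appended to a [global] section like the one above.

```shell
# gen_jobs MAX [ZONES]: print job sections for 1..MAX concurrent writers.
# ZONES is the number of zones per writer (128 in the example above).
gen_jobs() {
    max="$1"
    zones="${2:-128}"
    n=1
    while [ "$n" -le "$max" ]; do
        # stonewall makes each section wait for the previous one to finish.
        printf '[%dz%dt]\nstonewall\n' "$n" "$n"
        # A single writer needs no striping; further writers get their own zones.
        if [ "$n" -gt 1 ]; then
            printf 'offset_increment=%dz\n' "$zones"
        fi
        printf 'numjobs=%d\n\n' "$n"
        n=$((n + 1))
    done
}
```

Running gen_jobs 4 reproduces the four sections of the example, from [1z1t] through [4z4t].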
