SPDK Notes
Setting up SPDK is doable. Do:
- clone the SPDK repo
- check out a stable version
- follow https://spdk.io/doc/getting_started.html. It boils down to running, in order, a script that installs all dependencies, the configure script, and make.
Afterwards it is supposed to work. You can test this by running one of the example programs that are built with SPDK, for example:
./spdk/build/examples/nvme/identify
(The path might have changed, but look for an identify binary.)
This should list all available NVMe devices and some general stats, similar to nvme-cli.
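The setup steps above can be sketched as one shell function. This is a minimal sketch, not the official procedure: the name build_spdk and its tag argument are assumptions made for illustration, and the commands follow the getting-started guide linked above.

```shell
#!/bin/sh
# Sketch of the setup steps above. build_spdk is a hypothetical helper name.
# Usage: build_spdk <tag>   (pick a stable release tag from the SPDK repo)
# The body runs in a subshell, so the caller's working directory is untouched.
build_spdk() (
    set -e
    [ -n "${1:-}" ] || { echo "usage: build_spdk <tag>" >&2; exit 1; }
    # Clone SPDK and switch to the requested stable version.
    git clone https://github.com/spdk/spdk
    cd spdk
    git checkout "$1"
    # Fetch the submodules (for example DPDK) that belong to this version.
    git submodule update --init
    # Install dependencies, then configure and build.
    sudo scripts/pkgdep.sh
    ./configure
    make -j "$(nproc)"
)
```

Calling build_spdk without a tag only prints the usage line and fails, so a version is never picked silently.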
There is one major challenge with using SPDK. SPDK drives the device from user space, which means the kernel must first unbind it. SPDK can do this with its setup script, but by default it unbinds ALL NVMe devices. This is generally not what you want. Therefore, explicitly specify which devices you want to use. This is not done with names like "nvme0", but with the id of the device. Remember, we are moving away from Linux, and from its naming as well. The trid of a device (its PCI address) can be retrieved with:
trid=`ls -l /sys/block/$dev/device/device | awk '{split($11,dev,"/"); print dev[4]}'`
$dev should be the block device name as listed in /sys/block, for example nvme0n1. Then you can bind with:
export PCI_ALLOWED=$trid
./spdk/scripts/setup.sh
The devices can be handed back to the kernel with:
./spdk/scripts/setup.sh reset
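The lookup above can also be written with readlink instead of parsing ls output. The following is a minimal sketch under assumptions: pci_addr_of is a hypothetical helper name, and SYSFS_ROOT is a hypothetical override that exists only so the function can be tried without real hardware.

```shell
#!/bin/sh
# Sketch: look up the PCI address (the "trid" in these notes) of an NVMe
# block device through sysfs. pci_addr_of and SYSFS_ROOT are hypothetical.
pci_addr_of() {
    # $1 is a block device name as listed in /sys/block, e.g. nvme0n1.
    sysfs_link="${SYSFS_ROOT:-/sys}/block/$1/device/device"
    [ -e "$sysfs_link" ] || { echo "no PCI device found for $1" >&2; return 1; }
    # The link resolves to the PCI device directory; its last path
    # component is the address, e.g. 0000:00:04.0.
    basename "$(readlink -f "$sysfs_link")"
}

# Example (commented out because it rebinds real devices). The && keeps
# setup.sh from running with an empty PCI_ALLOWED when the lookup fails:
#   trid=$(pci_addr_of nvme0n1) && PCI_ALLOWED=$trid ./spdk/scripts/setup.sh
```

Stopping when the lookup fails matters here, because an empty PCI_ALLOWED would fall back to the default of unbinding every NVMe device.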
Fio is an essential tool for benchmarking I/O. Fio does not come with SPDK by default; it needs a plugin that is built as part of SPDK itself. We have seen machines crash while using the plugin, so always run it within a VM. Installing fio with SPDK takes a number of steps:
- Be sure that SPDK itself is cloned recursively. There should be a DPDK directory within SPDK.
- Clone fio anywhere.
- cd into the fio directory and call
./configure
- Call
make -j $nprocs
in the fio directory.
- (Optional) Install globally. Only do this when you want to use this fio version everywhere. Do this with
sudo make install
- cd back to the SPDK directory.
- Call
./configure
like usual, but now with an extra flag: --with-fio=<absolute path to the fio repo built earlier>
Do not use a relative path. This can break.
- Call
make -j $nprocs
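The build steps above can be sketched as a script that also takes care of the absolute path. This is a sketch under assumptions: abs_path and build_spdk_with_fio are hypothetical helper names, both directories are assumed to be cloned already, and the optional global install is left out.

```shell
#!/bin/sh
# Sketch of the fio + SPDK build described above. abs_path and
# build_spdk_with_fio are hypothetical helper names.

# Print the absolute path of an existing directory.
abs_path() {
    (cd "$1" && pwd)
}

# Usage: build_spdk_with_fio <spdk dir> <fio dir>
# The body runs in a subshell, so the caller's working directory is untouched.
build_spdk_with_fio() (
    set -e
    spdk_dir=$(abs_path "$1")
    fio_dir=$(abs_path "$2")   # --with-fio should get an absolute path
    # Build fio first; the SPDK plugin is compiled against this tree.
    cd "$fio_dir"
    ./configure
    make -j "$(nproc)"
    # Rebuild SPDK with the fio plugin enabled.
    cd "$spdk_dir"
    ./configure --with-fio="$fio_dir"
    make -j "$(nproc)"
)
```

Resolving both paths up front means a call such as build_spdk_with_fio ./spdk ./fio still hands an absolute path to --with-fio.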
Now verify if it works. It is not enough to simply call fio to use the plugin. You need to preload it. This can be done with:
sudo LD_LIBRARY_PATH=<SPDK_DIR>/build/lib LD_PRELOAD=<SPDK_DIR>/build/fio/spdk_nvme fio
There should now be an spdk entry in your listed storage engines. Try --enghelp=spdk as an argument to fio.
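Because the preload prefix is needed on every invocation, it can help to wrap it once. The following is a minimal sketch: spdk_fio is a hypothetical helper name, and SPDK_DIR is assumed to point at the SPDK checkout that was built with --with-fio.

```shell
#!/bin/sh
# Sketch: wrap the preload prefix so it does not have to be retyped.
# spdk_fio is a hypothetical helper name; SPDK_DIR must point at the
# SPDK checkout that was built with --with-fio.
spdk_fio() {
    [ -n "${SPDK_DIR:-}" ] || { echo "set SPDK_DIR first" >&2; return 1; }
    # env sets the variables on the root side of sudo; all remaining
    # arguments are handed to fio unchanged.
    sudo env \
        "LD_LIBRARY_PATH=$SPDK_DIR/build/lib" \
        "LD_PRELOAD=$SPDK_DIR/build/fio/spdk_nvme" \
        fio "$@"
}

# Examples:
#   SPDK_DIR=/path/to/spdk spdk_fio --enghelp=spdk
#   SPDK_DIR=/path/to/spdk spdk_fio example.fio
```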
Running fio with SPDK takes some getting used to. It has some issues with linking, paths and, rarely, a segfault or two.
Just like we did with the install, put
LD_LIBRARY_PATH=<SPDK_DIR>/build/lib LD_PRELOAD=<SPDK_DIR>/build/fio/spdk_nvme
before fio and run as root. There are a few other things that you need to know:
- Always explicitly set the storage engine with ioengine=spdk.
- With SPDK we refer to the device with a trid (e.g 9:00:00.0). This is not valid in fio and results in parsing errors, because fio uses ":" to separate multiple filenames. The trid needs to be translated to use "." instead of ":". If you store the trid in the variable "trid", you can convert it to the valid format with
triddot=$(echo "$trid" | sed 's/:/./g')
- Format trid + namespace like
filename=trtype=PCIe traddr=0000.00.04.0 ns=2
This works terribly on the command line because of the spaces in the filename. Try to use ".fio" files where possible to load the benchmark instead.
- When using ZNS-like namespaces, you need to convert the namespace id. Fio expects 1-indexed namespaces, not 0-indexed like other systems. So if you want to use namespace 0, set ns=1, and if you want to use namespace 1, set ns=2.
- The SPDK plugin does not support fio jobs that run as separate processes. It is therefore absolutely required to set thread=1, which makes fio create its jobs as threads. There is no error handling to protect you.
- Use ZNS with zonemode=zbd. Within ZNS there are a number of idiosyncrasies:
  - To use appends for I/O, set append=1 as well.
  - iodepth cannot be higher than 1 when NOT using appends. Only set iodepth when using append.
  - Preferably use initial_zone_reset=1. This resets the zones before doing the benchmarks.
  - Use max_open_zones=... to limit the number of zones that can be opened by fio. Never set it higher than the real max_open_zones of the device.
  - Use offset_increment=...z to force striping. Each additional writer will then write to a different selection of zones.
- Increase writers with numjobs.
- Be careful with setting scheduling. Setting arbitration_burst higher than is allowed will freeze the system. The device is essentially stuck. Unless needed, do not use enable_wrr and arbitration_burst.
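The trid and namespace conversions from the list above can be combined in one helper. This is a minimal sketch: spdk_fio_filename is a hypothetical helper name, and the "add one to the namespace" rule simply follows the convention described in these notes.

```shell
#!/bin/sh
# Sketch: build the filename= value for the SPDK fio plugin from a trid
# and a 0-indexed namespace number, following the conventions in the
# list above. spdk_fio_filename is a hypothetical helper name.
# Usage: spdk_fio_filename <trid> <0-indexed namespace>
spdk_fio_filename() {
    # fio cannot parse ":" inside a filename, so use "." instead.
    traddr=$(printf '%s' "$1" | tr ':' '.')
    # These notes use 1-indexed namespaces in fio: namespace 0 becomes ns=1.
    ns=$(( $2 + 1 ))
    printf 'trtype=PCIe traddr=%s ns=%s\n' "$traddr" "$ns"
}
```

For example, spdk_fio_filename 0000:00:04.0 1 prints trtype=PCIe traddr=0000.00.04.0 ns=2. Because the result contains spaces, it has to be quoted as a whole when it is passed on a command line.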
For some examples of working SPDK fio tests for ZNS, look at the SPDK directory in https://github.com/Krien/ZNS_SPDK_Benchmarks. The fio files in particular should be relevant. A simple write example could be:
# Inside of example.fio
[global]
ioengine=spdk
thread=1
group_reporting=1
direct=1
time_based=1
ramp_time=5
runtime=60
size=128z
rw=write
iodepth=1
zonemode=zbd
max_open_zones=14
initial_zone_reset=1
filename=trtype=PCIe traddr=0000.00.04.0 ns=2
[1z1t]
stonewall
numjobs=1
[2z2t]
stonewall
offset_increment=128z
numjobs=2
[3z3t]
stonewall
offset_increment=128z
numjobs=3
[4z4t]
stonewall
offset_increment=128z
numjobs=4
This test will stripe data using writes on a ZNS SSD with various levels of concurrency. Notice how stonewall is used to separate the benchmarks and how we use ramp_time.
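The per-concurrency sections of the example follow a fixed pattern, so they can be generated instead of typed out. The following is a minimal sketch: gen_zns_jobs is a hypothetical helper name, and the section naming ([1z1t], [2z2t], and so on) is copied from the example above.

```shell
#!/bin/sh
# Sketch: print the per-concurrency job sections of the example above.
# gen_zns_jobs is a hypothetical helper name.
# Usage: gen_zns_jobs <max writers> <zones per writer>
gen_zns_jobs() {
    n=1
    while [ "$n" -le "$1" ]; do
        printf '[%sz%st]\n' "$n" "$n"
        # stonewall makes this section wait until the previous one is done.
        printf 'stonewall\n'
        # With more than one writer, give each its own range of zones.
        if [ "$n" -gt 1 ]; then
            printf 'offset_increment=%sz\n' "$2"
        fi
        printf 'numjobs=%s\n' "$n"
        n=$((n + 1))
    done
}
```

Calling gen_zns_jobs 4 128 prints the four sections of the example. The output can be appended to a file that already holds the [global] section.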