
Can't output results of FIO test for rook-ceph-block storage #74

Open
vkamlov opened this issue Jun 11, 2021 · 21 comments
Labels: invalid (This doesn't seem right)

Comments

vkamlov commented Jun 11, 2021

Info:
kubestr version - any
Workstation running kubestr - macOS Big Sur 11.2.3 / Debian 10 (buster)
Shell - bash/zsh

Scenario:

$ ./kubestr

Kubernetes Version Check:
Valid kubernetes version (v1.16.15) - OK

RBAC Check:
Kubernetes RBAC is enabled - OK

Aggregated Layer Check:
The Kubernetes Aggregated Layer is enabled - OK

Available Storage Provisioners:

rook-ceph.rbd.csi.ceph.com:
Cluster is not CSI snapshot capable. Requires VolumeSnapshotDataSource feature gate.
This is a CSI driver!
(The following info may not be up to date. Please check with the provider for more information.)
Provider: Ceph RBD
Website: https://github.com/ceph/ceph-csi
Description: A Container Storage Interface (CSI) Driver for Ceph RBD
Additional Features: Raw Block, Snapshot, Expansion, Topology, Cloning

Storage Classes:
  * rook-ceph-block

To perform a FIO test, run-
  ./kubestr fio -s <storage class>

This provisioner supports snapshots, however no Volume Snaphsot Classes were found.

$ ./kubestr fio -s rook-ceph-block
PVC created kubestr-fio-pvc-xvbxj
Pod created kubestr-fio-pod-vkzb8
Running FIO test (default-fio) on StorageClass (rook-ceph-block) with a PVC of Size (100Gi)
Elapsed time- 50.430796167s
FIO test results:
Failed while running FIO test.: Unable to parse fio output into json.: unexpected end of JSON input - Error

bathina2 (Contributor)

Hi @vkamlov,
It's unfortunate that you ran into this issue. Can you help me debug it?
You will have to:

  • clone the repo
  • git checkout debug-74
  • go build
  • ./kubestr fio -s rook-ceph-block

I added some debugging that should print the original string. We can figure out why it's failing to parse it.

vkamlov (Author) commented Jun 11, 2021

The output is the same; there is no debug information in this build.

vkamlov (Author) commented Jun 11, 2021

Except for the empty brackets:

./kubestr fio -s rook-ceph-block
PVC created kubestr-fio-pvc-wmrwr
Pod created kubestr-fio-pod-mzdft
Running FIO test (default-fio) on StorageClass (rook-ceph-block) with a PVC of Size (100Gi)
Elapsed time- 50.379957197s
FIO test results:
Failed while running FIO test.: Unable to parse fio output into json. (): unexpected end of JSON input - Error

bathina2 (Contributor)

Can you try again? I added additional debugging. It looks like the FIO results are coming up empty, which is why parsing failed.

vkamlov (Author) commented Jun 11, 2021

./kubestr fio -s rook-ceph-block
PVC created kubestr-fio-pvc-ljvhz
Pod created kubestr-fio-pod-z4f8h
Running FIO test (default-fio) on StorageClass (rook-ceph-block) with a PVC of Size (100Gi)
\ panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x20941c3]

goroutine 54 [running]:
github.com/kastenhq/kubestr/pkg/fio.(*fioStepper).runFIOCommand.func1(0xc00022cba0, 0x23eeb42, 0x7, 0xc000043170, 0x15, 0x23f2990, 0xb, 0xc0000d22d0, 0x5, 0x5, ...)
/tmp/kubestr/pkg/fio/fio.go:315 +0x143
created by github.com/kastenhq/kubestr/pkg/fio.(*fioStepper).runFIOCommand
/tmp/kubestr/pkg/fio/fio.go:313 +0x325

bathina2 (Contributor)

Heh, that made things worse. Can we try again?

vkamlov (Author) commented Jun 11, 2021

./kubestr fio -s rook-ceph-block
PVC created kubestr-fio-pvc-c59ts
Pod created kubestr-fio-pod-8zngt
Running FIO test (default-fio) on StorageClass (rook-ceph-block) with a PVC of Size (100Gi)
\ stdout-
stderr-
Elapsed time- 50.323057437s
FIO test results:
Failed while running FIO test.: Unable to parse fio output into json. (): unexpected end of JSON input - Error

vkamlov (Author) commented Jun 29, 2021

Any updates?

bathina2 (Contributor)

Hey @vkamlov, sorry for the delay.
Would you be free to jump on a call?

bathina2 (Contributor) commented Jun 29, 2021

If not, can you try deploying the following Pod and PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kubestr-pvc
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: kubestr-pvc
  containers:
    - name: kubestr
      image: ghcr.io/kastenhq/kubestr:latest
      command: [ "/bin/sh" ]
      args: [ "-c", "tail -f /dev/null" ]
      volumeMounts:
        - mountPath: "/data"
          name: data

After doing that we can run kubectl exec task-pv-pod -i -t -- sh (using the Pod name from the manifest above).
This will let us enter the pod and try running FIO commands there. For some reason we are not seeing the output of the commands (they are coming up empty, with no error either).

vkamlov (Author) commented Jun 30, 2021

I deployed the container and ran some tests from the official fio documentation.
Here is the output I got; everything looks good.

$ kubectl exec -it task-pv-pod -- sh
/ # 
/ # 
/ # fio --name=global --rw=randread --size=128m --name=job1 --name=job2
job1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
job2: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.20
Starting 2 processes
job1: Laying out IO file (1 file / 128MiB)
job2: Laying out IO file (1 file / 128MiB)
Jobs: 2 (f=2): [r(2)][100.0%][r=20.0MiB/s][r=5367 IOPS][eta 00m:00s]
job1: (groupid=0, jobs=1): err= 0: pid=16: Wed Jun 30 10:02:02 2021
  read: IOPS=2095, BW=8383KiB/s (8584kB/s)(128MiB/15635msec)
    clat (usec): min=137, max=30867, avg=439.46, stdev=487.62
     lat (usec): min=138, max=30877, avg=447.87, stdev=488.75
    clat percentiles (usec):
     |  1.00th=[  167],  5.00th=[  196], 10.00th=[  221], 20.00th=[  253],
     | 30.00th=[  281], 40.00th=[  306], 50.00th=[  330], 60.00th=[  363],
     | 70.00th=[  412], 80.00th=[  515], 90.00th=[  734], 95.00th=[ 1004],
     | 99.00th=[ 1909], 99.50th=[ 2606], 99.90th=[ 5407], 99.95th=[ 7963],
     | 99.99th=[20055]
   bw (  KiB/s): min= 6027, max=12167, per=50.74%, avg=8410.81, stdev=1609.35, samples=31
   iops        : min= 1506, max= 3041, avg=2102.42, stdev=402.32, samples=31
  lat (usec)   : 250=18.79%, 500=60.23%, 750=11.42%, 1000=4.55%
  lat (msec)   : 2=4.09%, 4=0.73%, 10=0.17%, 20=0.01%, 50=0.01%
  cpu          : usr=6.06%, sys=22.69%, ctx=33198, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=32768,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
job2: (groupid=0, jobs=1): err= 0: pid=17: Wed Jun 30 10:02:02 2021
  read: IOPS=2072, BW=8289KiB/s (8488kB/s)(128MiB/15813msec)
    clat (usec): min=133, max=60526, avg=444.96, stdev=599.91
     lat (usec): min=134, max=60530, avg=453.34, stdev=600.95
    clat percentiles (usec):
     |  1.00th=[  167],  5.00th=[  198], 10.00th=[  221], 20.00th=[  255],
     | 30.00th=[  281], 40.00th=[  306], 50.00th=[  334], 60.00th=[  367],
     | 70.00th=[  416], 80.00th=[  519], 90.00th=[  742], 95.00th=[ 1012],
     | 99.00th=[ 1893], 99.50th=[ 2638], 99.90th=[ 5669], 99.95th=[ 8455],
     | 99.99th=[20317]
   bw (  KiB/s): min= 6083, max=12000, per=50.31%, avg=8339.29, stdev=1634.54, samples=31
   iops        : min= 1520, max= 3000, avg=2084.48, stdev=408.59, samples=31
  lat (usec)   : 250=18.48%, 500=60.25%, 750=11.43%, 1000=4.68%
  lat (msec)   : 2=4.26%, 4=0.71%, 10=0.16%, 20=0.02%, 50=0.01%
  lat (msec)   : 100=0.01%
  cpu          : usr=5.70%, sys=22.72%, ctx=33196, majf=0, minf=10
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=32768,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=16.2MiB/s (16.0MB/s), 8289KiB/s-8383KiB/s (8488kB/s-8584kB/s), io=256MiB (268MB), run=15635-15813msec
/ # 
/ # 
/ # fio --name=random-writers --ioengine=libaio --iodepth=4 --rw=randwrite --bs=32k --direct=0 --size=64m --numjobs=4
random-writers: (g=0): rw=randwrite, bs=(R) 32.0KiB-32.0KiB, (W) 32.0KiB-32.0KiB, (T) 32.0KiB-32.0KiB, ioengine=libaio, iodepth=4
...
fio-3.20
Starting 4 processes
random-writers: Laying out IO file (1 file / 64MiB)
random-writers: Laying out IO file (1 file / 64MiB)
random-writers: Laying out IO file (1 file / 64MiB)
random-writers: Laying out IO file (1 file / 64MiB)

random-writers: (groupid=0, jobs=1): err= 0: pid=20: Wed Jun 30 10:02:21 2021
  write: IOPS=8031, BW=251MiB/s (263MB/s)(64.0MiB/255msec); 0 zone resets
    slat (usec): min=36, max=238, avg=77.43, stdev=26.44
    clat (usec): min=18, max=1042, avg=385.44, stdev=112.34
     lat (usec): min=78, max=1186, avg=470.35, stdev=134.32
    clat percentiles (usec):
     |  1.00th=[  212],  5.00th=[  221], 10.00th=[  225], 20.00th=[  245],
     | 30.00th=[  338], 40.00th=[  371], 50.00th=[  412], 60.00th=[  424],
     | 70.00th=[  437], 80.00th=[  457], 90.00th=[  515], 95.00th=[  570],
     | 99.00th=[  660], 99.50th=[  717], 99.90th=[  906], 99.95th=[  988],
     | 99.99th=[ 1045]
  lat (usec)   : 20=0.05%, 250=20.46%, 500=67.24%, 750=11.91%, 1000=0.29%
  lat (msec)   : 2=0.05%
  cpu          : usr=27.56%, sys=70.47%, ctx=8, majf=0, minf=11
  IO depths    : 1=0.1%, 2=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2048,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4
random-writers: (groupid=0, jobs=1): err= 0: pid=21: Wed Jun 30 10:02:21 2021
  write: IOPS=7876, BW=246MiB/s (258MB/s)(64.0MiB/260msec); 0 zone resets
    slat (usec): min=33, max=489, avg=77.95, stdev=32.08
    clat (usec): min=23, max=1769, avg=394.50, stdev=151.30
     lat (usec): min=92, max=1889, avg=480.46, stdev=177.47
    clat percentiles (usec):
     |  1.00th=[  204],  5.00th=[  210], 10.00th=[  215], 20.00th=[  227],
     | 30.00th=[  285], 40.00th=[  400], 50.00th=[  412], 60.00th=[  420],
     | 70.00th=[  433], 80.00th=[  486], 90.00th=[  570], 95.00th=[  635],
     | 99.00th=[  742], 99.50th=[  832], 99.90th=[ 1680], 99.95th=[ 1713],
     | 99.99th=[ 1778]
  lat (usec)   : 50=0.05%, 250=24.71%, 500=56.69%, 750=17.63%, 1000=0.49%
  lat (msec)   : 2=0.44%
  cpu          : usr=41.31%, sys=57.14%, ctx=9, majf=0, minf=13
  IO depths    : 1=0.1%, 2=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2048,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4
random-writers: (groupid=0, jobs=1): err= 0: pid=22: Wed Jun 30 10:02:21 2021
  write: IOPS=7846, BW=245MiB/s (257MB/s)(64.0MiB/261msec); 0 zone resets
    slat (usec): min=32, max=637, avg=79.19, stdev=36.53
    clat (usec): min=17, max=2185, avg=394.94, stdev=151.92
     lat (usec): min=63, max=2368, avg=481.59, stdev=178.84
    clat percentiles (usec):
     |  1.00th=[  206],  5.00th=[  212], 10.00th=[  217], 20.00th=[  225],
     | 30.00th=[  293], 40.00th=[  408], 50.00th=[  416], 60.00th=[  424],
     | 70.00th=[  437], 80.00th=[  469], 90.00th=[  562], 95.00th=[  652],
     | 99.00th=[  783], 99.50th=[  832], 99.90th=[ 2057], 99.95th=[ 2114],
     | 99.99th=[ 2180]
  lat (usec)   : 20=0.05%, 100=0.05%, 250=27.05%, 500=57.91%, 750=13.53%
  lat (usec)   : 1000=1.27%
  lat (msec)   : 4=0.15%
  cpu          : usr=27.31%, sys=71.15%, ctx=25, majf=0, minf=13
  IO depths    : 1=0.1%, 2=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2048,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4
random-writers: (groupid=0, jobs=1): err= 0: pid=23: Wed Jun 30 10:02:21 2021
  write: IOPS=8000, BW=250MiB/s (262MB/s)(64.0MiB/256msec); 0 zone resets
    slat (usec): min=32, max=505, avg=77.38, stdev=31.58
    clat (usec): min=17, max=975, avg=386.85, stdev=128.47
     lat (usec): min=70, max=1132, avg=471.73, stdev=155.17
    clat percentiles (usec):
     |  1.00th=[  204],  5.00th=[  212], 10.00th=[  217], 20.00th=[  225],
     | 30.00th=[  330], 40.00th=[  396], 50.00th=[  412], 60.00th=[  420],
     | 70.00th=[  429], 80.00th=[  461], 90.00th=[  553], 95.00th=[  619],
     | 99.00th=[  717], 99.50th=[  766], 99.90th=[  873], 99.95th=[  922],
     | 99.99th=[  979]
  lat (usec)   : 20=0.05%, 100=0.05%, 250=24.61%, 500=59.72%, 750=14.89%
  lat (usec)   : 1000=0.68%
  cpu          : usr=29.02%, sys=70.59%, ctx=7, majf=0, minf=13
  IO depths    : 1=0.1%, 2=0.1%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2048,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=4

Run status group 0 (all jobs):
  WRITE: bw=981MiB/s (1028MB/s), 245MiB/s-251MiB/s (257MB/s-263MB/s), io=256MiB (268MB), run=255-261msec

bathina2 (Contributor)

You have to run it against the mounted directory:
fio --directory /data .....

vkamlov (Author) commented Jun 30, 2021

/ # fio --directory /data --name=global --rw=randread --size=128m --name=job1 --name=job2
job1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
job2: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.20
Starting 2 processes
job1: Laying out IO file (1 file / 128MiB)
job2: Laying out IO file (1 file / 128MiB)
Jobs: 1 (f=1): [r(1),_(1)][100.0%][r=652KiB/s][r=163 IOPS][eta 00m:00s]
job1: (groupid=0, jobs=1): err= 0: pid=15: Wed Jun 30 22:19:46 2021
  read: IOPS=64, BW=259KiB/s (266kB/s)(128MiB/505352msec)
    clat (usec): min=1536, max=424419, avg=15373.01, stdev=18983.77
     lat (usec): min=1540, max=424434, avg=15383.14, stdev=18984.11
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    4], 20.00th=[    4],
     | 30.00th=[    5], 40.00th=[    5], 50.00th=[    6], 60.00th=[   10],
     | 70.00th=[   18], 80.00th=[   26], 90.00th=[   39], 95.00th=[   55],
     | 99.00th=[   85], 99.50th=[   96], 99.90th=[  157], 99.95th=[  197],
     | 99.99th=[  259]
   bw (  KiB/s): min=   32, max= 1016, per=49.85%, avg=258.23, stdev=153.63, samples=983
   iops        : min=    8, max=  254, avg=64.38, stdev=38.42, samples=983
  lat (msec)   : 2=0.27%, 4=22.05%, 10=38.07%, 20=13.59%, 50=19.86%
  lat (msec)   : 100=5.75%, 250=0.40%, 500=0.02%
  cpu          : usr=0.28%, sys=0.85%, ctx=33292, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=32768,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
job2: (groupid=0, jobs=1): err= 0: pid=16: Wed Jun 30 22:19:46 2021
  read: IOPS=74, BW=300KiB/s (307kB/s)(128MiB/437554msec)
    clat (usec): min=1412, max=409529, avg=13305.23, stdev=17784.49
     lat (usec): min=1416, max=409542, avg=13315.49, stdev=17784.84
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    4], 20.00th=[    4],
     | 30.00th=[    5], 40.00th=[    5], 50.00th=[    6], 60.00th=[    7],
     | 70.00th=[   12], 80.00th=[   22], 90.00th=[   36], 95.00th=[   52],
     | 99.00th=[   80], 99.50th=[   93], 99.90th=[  140], 99.95th=[  169],
     | 99.99th=[  268]
   bw (  KiB/s): min=   40, max= 1085, per=57.82%, avg=299.49, stdev=187.15, samples=873
   iops        : min=   10, max=  271, avg=74.71, stdev=46.79, samples=873
  lat (msec)   : 2=0.53%, 4=26.84%, 10=41.32%, 20=10.21%, 50=15.90%
  lat (msec)   : 100=4.86%, 250=0.32%, 500=0.02%
  cpu          : usr=0.32%, sys=0.95%, ctx=33219, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=32768,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=519KiB/s (531kB/s), 259KiB/s-300KiB/s (266kB/s-307kB/s), io=256MiB (268MB), run=437554-505352msec

Disk stats (read/write):
  rbd3: ios=65563/448, merge=0/25, ticks=927008/10082, in_queue=935584, util=99.62%

vkamlov (Author) commented Jun 30, 2021

I also ran all the tests from the source file fio_jobs.go. The results look good, like this last test:

/ # fio --directory /data --name=rand_readwrite --bs=4K --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75 --time_based --ramp_time=2s --runtime=15s
rand_readwrite: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=64
fio-3.20
Starting 1 process
Jobs: 1 (f=1): [m(1)][0.0%][eta 30d:08h:10m:20s]                           
rand_readwrite: (groupid=0, jobs=1): err= 0: pid=62: Wed Jun 30 22:42:34 2021
  read: IOPS=0, BW=1264B/s (1264B/s)(24.0KiB/19430msec)
    clat (nsec): min=27778, max=59981, avg=44527.67, stdev=15340.15
     lat (nsec): min=35095, max=69363, avg=52516.33, stdev=15882.41
    clat percentiles (nsec):
     |  1.00th=[27776],  5.00th=[27776], 10.00th=[27776], 20.00th=[30848],
     | 30.00th=[30848], 40.00th=[33536], 50.00th=[33536], 60.00th=[57088],
     | 70.00th=[58112], 80.00th=[58112], 90.00th=[60160], 95.00th=[60160],
     | 99.00th=[60160], 99.50th=[60160], 99.90th=[60160], 99.95th=[60160],
     | 99.99th=[60160]
   bw (  KiB/s): min=    7, max=   24, per=100.00%, avg=15.67, stdev= 8.50, samples=3
   iops        : min=    1, max=    6, avg= 3.67, stdev= 2.52, samples=3
  write: IOPS=0, BW=632B/s (632B/s)(12.0KiB/19430msec); 0 zone resets
    clat (msec): min=1430, max=14528, avg=6475.82, stdev=7048.14
     lat (msec): min=1430, max=14528, avg=6475.82, stdev=7048.14
    clat percentiles (msec):
     |  1.00th=[ 1435],  5.00th=[ 1435], 10.00th=[ 1435], 20.00th=[ 1435],
     | 30.00th=[ 1435], 40.00th=[ 3473], 50.00th=[ 3473], 60.00th=[ 3473],
     | 70.00th=[14563], 80.00th=[14563], 90.00th=[14563], 95.00th=[14563],
     | 99.00th=[14563], 99.50th=[14563], 99.90th=[14563], 99.95th=[14563],
     | 99.99th=[14563]
   bw (  KiB/s): min=    7, max=    8, per=100.00%, avg= 7.50, stdev= 0.71, samples=2
   iops        : min=    1, max=    2, avg= 1.50, stdev= 0.71, samples=2
  lat (usec)   : 50=33.33%, 100=33.33%
  lat (msec)   : 2000=11.11%, >=2000=22.22%
  cpu          : usr=0.00%, sys=0.04%, ctx=106, majf=0, minf=58
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=6,3,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=1264B/s (1264B/s), 1264B/s-1264B/s (1264B/s-1264B/s), io=24.0KiB (24.6kB), run=19430-19430msec
  WRITE: bw=632B/s (632B/s), 632B/s-632B/s (632B/s-632B/s), io=12.0KiB (12.3kB), run=19430-19430msec

Disk stats (read/write):
  rbd3: ios=0/2254, merge=0/26, ticks=0/52128, in_queue=52648, util=99.74%

Only this line looks strange, but I don't know if it matters:
Jobs: 1 (f=1): [m(1)][0.0%][eta 30d:08h:10m:20s]

bathina2 (Contributor) commented Jul 1, 2021

Hmm, the last thing that's missing is the --output-format=json option. Otherwise this issue is very bizarre. The call that runs FIO is stdout, stderr, err = s.kubeExecutor.exec(namespace, podName, containerName, command), and stdout, stderr, and err are all coming up empty.
Have you tried this on other clusters or other storage classes? I'm unable to reproduce this.

vkamlov (Author) commented Jul 1, 2021

I tried on clusters created with kubeadm (k8s v1.15) and Rancher (k8s v1.15-1.21), with both rook-ceph-block and Portworx.
This cluster is temporary; I am constantly redeploying it for tests.
Everything looks good with the AWS gp2 storage class.

bathina2 (Contributor) commented Jul 1, 2021

That is why this is perplexing. From within the pod it should be the same image with a volume attached at /data.

vkamlov (Author) commented Jul 1, 2021

Earlier you offered a call. Where can we set that up?

erkerb4 commented Sep 20, 2021

@bathina2 & @vkamlov -- Interestingly, I've run into the same problem with v0.4.18 using ceph-block storage with Rook. The FIO tests run and results display successfully when I use Kubestr v0.4.17.

I am running kubestr from a CentOS 7 VM, with a Kubernetes cluster v1.21.3. I have Rook v1.7 deployed with block storage.

erkerb4 commented Sep 20, 2021

Alright, I spoke too soon. I started changing some parameters, and I ran into issues once I adjusted the runtime parameter on a seq-read fio test.

The FIO test runs without issues with this setting:

[global]
name=fio-seq-reads
filename=fio-seq-reads
rw=read
bs=256K
direct=1
numjobs=1
time_based
**runtime=120**

[file1]
size=10G
ioengine=libaio
iodepth=16

However, the following causes the error condition when the job finishes:

[global]
name=fio-seq-reads
filename=fio-seq-reads
rw=read
bs=256K
direct=1
numjobs=1
time_based
**runtime=240**

[file1]
size=10G
ioengine=libaio
iodepth=16

Result:

Running FIO test (fio-seq-read.fio) on StorageClass (ceph-block) with a PVC of Size (100Gi)
Elapsed time- 5m0.034013586s
FIO test results:
  Failed while running FIO test.: Unable to parse fio output into json.: unexpected end of JSON input  -  Error

bathina2 added the "invalid" label on Oct 6, 2021
P-n-I commented Oct 9, 2024

I've seen something similar: I had to turn ioengine=libaio off because it fails on one of my clusters, but I left iodepth=64 in the config, which resulted in:

note: both iodepth >= 1 and synchronous I/O engine are selected, queue depth will be capped at 1
note: both iodepth >= 1 and synchronous I/O engine are selected, queue depth will be capped at 1
note: both iodepth >= 1 and synchronous I/O engine are selected, queue depth will be capped at 1
{
  "fio version" : "fio-3.36",
  "timestamp" : 1728489768,
  .....

This makes fio's output invalid JSON and therefore causes parse errors.
