Benchmarks
Eduardo Bart edited this page Dec 19, 2024 · 22 revisions
Benchmark | Host | QEMU | Cartesi Machine | Benchmark command |
---|---|---|---|---|
CPU registers | 1 | 1.18 ± 0.20 | 5.30 ± 0.13 | stress-ng --regs 1 --regs-ops 10000 |
Zlib compression | 1 | 4.16 ± 0.10 | 7.83 ± 0.10 | stress-ng --zlib 1 --zlib-ops 20 |
Forking | 1 | 8.28 ± 0.22 | 9.36 ± 0.23 | stress-ng --fork 1 --fork-ops 1000 |
Naive loop | 1 | 4.15 ± 0.15 | 11.26 ± 0.39 | stress-ng --cpu 1 --cpu-method loop --cpu-ops 400 |
Memory read/write | 1 | 5.36 ± 0.38 | 12.35 ± 0.86 | stress-ng --memrate 1 --memrate-bytes 2M --memrate-ops 200 |
Heapsort | 1 | 7.14 ± 0.11 | 12.62 ± 0.31 | stress-ng --heapsort 1 --heapsort-ops 3 |
Fibonacci | 1 | 4.97 ± 0.07 | 15.32 ± 0.17 | stress-ng --cpu 1 --cpu-method fibonacci --cpu-ops 400 |
Linux syscalls | 1 | 1.59 ± 0.15 | 15.43 ± 1.41 | stress-ng --syscall 1 --syscall-ops 4000 |
Checksum hashes | 1 | 7.21 ± 0.17 | 15.81 ± 0.40 | stress-ng --hash 1 --hash-ops 40000 |
Cache thrashing | 1 | 10.36 ± 2.20 | 15.83 ± 1.06 | stress-ng --cache 1 --cache-ops 100000 |
Disk writes | 1 | 8.23 ± 0.44 | 17.36 ± 0.95 | stress-ng --hdd 1 --hdd-ops 2000 |
Quicksort | 1 | 10.74 ± 0.22 | 17.78 ± 0.36 | stress-ng --qsort 1 --qsort-ops 5 |
TLB shootdowns | 1 | 11.16 ± 0.90 | 18.12 ± 1.08 | stress-ng --tlb-shootdown 1 --tlb-shootdown-ops 2000 |
Memory allocation | 1 | 15.65 ± 0.89 | 19.65 ± 1.12 | stress-ng --malloc 1 --malloc-ops 40000 |
Integer arithmetic | 1 | 13.85 ± 0.47 | 25.60 ± 0.86 | stress-ng --cpu 1 --cpu-method int64 --cpu-ops 400 |
Memory copy | 1 | 11.57 ± 0.40 | 32.60 ± 5.21 | stress-ng --memcpy 1 --memcpy-ops 50 |
Instruction cache thrashing | 1 | 26.68 ± 0.72 | 36.64 ± 1.00 | stress-ng --icache 1 --icache-ops 200 |
SHA-256 | 1 | 19.23 ± 1.21 | 37.95 ± 2.22 | stress-ng --crypt 1 --crypt-method SHA-256 --crypt-ops 400000 |
Floating-point math | 1 | 24.18 ± 0.87 | 41.85 ± 1.59 | stress-ng --fp 1 --fp-method floatadd --fp-ops 1000 |
Floating-point vector math | 1 | 25.47 ± 2.11 | 53.08 ± 4.37 | stress-ng --vecfp 1 --vecfp-ops 100 |
Floating-point matrix multiplication | 1 | 33.66 ± 3.21 | 62.17 ± 5.92 | stress-ng --matrix 1 --matrix-method mult --matrix-ops 20000 |
Floating-point fused multiply add | 1 | 30.16 ± 2.16 | 63.73 ± 4.54 | stress-ng --fma 1 --fma-ops 40000 |
Floating-point trigonometric math | 1 | 38.37 ± 4.28 | 81.17 ± 9.05 | stress-ng --trig 1 --trig-ops 50 |
Integer vector arithmetic | 1 | 18.95 ± 1.51 | 112.91 ± 8.81 | stress-ng --vecmath 1 --vecmath-ops 100 |
Floating-point square root | 1 | 62.32 ± 9.64 | 179.88 ± 27.6 | stress-ng --cpu 1 --cpu-method sqrt --cpu-ops 20 |
How to read: All numbers are slowdowns relative to the same benchmark run on the host. For example, 5.30 ± 0.13 means the benchmark ran 5.30 times faster on the host than in the guest virtual machine, with a standard deviation of 0.13.
- Both QEMU and Cartesi Machine used the same guest kernel and guest rootfs.
- QEMU is faster because it has a JIT (just-in-time) compiler.
- Floating-point benchmarks are slow because floating point is emulated in software.
- The vector math benchmark is slow because the guest CPU has no SIMD instructions, while the host does.
- The square root benchmark is the slowest because square root is the heaviest instruction in the Cartesi Machine.
- Cartesi Machine is between 5.30x and 179.88x slower than the host, with a median of 18x, depending on the workload.
- Cartesi Machine is between 1.13x and 9.70x slower than QEMU, with a median of 2x, which is pretty good considering it has no JIT.
- The CPU registers benchmark is the fastest, meaning reads and writes of RISC-V general-purpose registers are fast.
- Square root of floating-point numbers is the slowest benchmark because it's the only instruction in the RISC-V interpreter that performs a loop.
- SHA-256 and integer vector arithmetic are noticeably slower because there is no support for SIMD instructions.
- Floating-point benchmarks are noticeably slower because of the deterministic software emulation in the RISC-V interpreter.
- QEMU 9.1.2
- Host CPU x86_64 Intel Core i9-14900K
- Host Linux 6.6.65-1-lts
- Guest Linux 6.5.13-ctsi-1
- GCC 14.2.1 20240910
- stress-ng 0.17.06
- Cartesi Machine Emulator 0.19.0
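
The ranges and medians quoted in the notes can be re-derived from the table. The following is a minimal Python sketch, not part of the benchmark tooling. It assumes the mean slowdowns are copied by hand from the table (standard deviations are ignored), and the names `RESULTS` and `summarize` are illustrative only.

```python
from statistics import median

# (benchmark, QEMU slowdown vs host, Cartesi Machine slowdown vs host)
# Mean values copied from the table above; standard deviations are omitted.
RESULTS = [
    ("CPU registers", 1.18, 5.30),
    ("Zlib compression", 4.16, 7.83),
    ("Forking", 8.28, 9.36),
    ("Naive loop", 4.15, 11.26),
    ("Memory read/write", 5.36, 12.35),
    ("Heapsort", 7.14, 12.62),
    ("Fibonacci", 4.97, 15.32),
    ("Linux syscalls", 1.59, 15.43),
    ("Checksum hashes", 7.21, 15.81),
    ("Cache thrashing", 10.36, 15.83),
    ("Disk writes", 8.23, 17.36),
    ("Quicksort", 10.74, 17.78),
    ("TLB shootdowns", 11.16, 18.12),
    ("Memory allocation", 15.65, 19.65),
    ("Integer arithmetic", 13.85, 25.60),
    ("Memory copy", 11.57, 32.60),
    ("Instruction cache thrashing", 26.68, 36.64),
    ("SHA-256", 19.23, 37.95),
    ("Floating-point math", 24.18, 41.85),
    ("Floating-point vector math", 25.47, 53.08),
    ("Floating-point matrix multiplication", 33.66, 62.17),
    ("Floating-point fused multiply add", 30.16, 63.73),
    ("Floating-point trigonometric math", 38.37, 81.17),
    ("Integer vector arithmetic", 18.95, 112.91),
    ("Floating-point square root", 62.32, 179.88),
]


def summarize(values):
    """Return (minimum, median, maximum) of a list of slowdown factors."""
    return min(values), median(values), max(values)


# Slowdown of the Cartesi Machine relative to the host (taken directly from the table).
cm_vs_host = [cm for _, _, cm in RESULTS]
# Slowdown of the Cartesi Machine relative to QEMU (ratio of the two host-relative means).
cm_vs_qemu = [cm / qemu for _, qemu, cm in RESULTS]

host_min, host_med, host_max = summarize(cm_vs_host)
qemu_min, qemu_med, qemu_max = summarize(cm_vs_qemu)

print(f"Cartesi Machine vs host: {host_min:.2f}x to {host_max:.2f}x, median {host_med:.2f}x")
print(f"Cartesi Machine vs QEMU: {qemu_min:.2f}x to {qemu_max:.2f}x, median {qemu_med:.2f}x")
```

With the table values as entered, this gives 5.30x to 179.88x (median 18.12x) against the host and 1.13x to 9.70x (median 2.08x) against QEMU. The extremes against QEMU come from the forking and Linux syscalls benchmarks respectively.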