Benchmarks 2024 11 26 TFLM GCC O3 spike_rv32

GitHub Action edited this page Nov 26, 2024 · 1 revision

Setup

Simulator

  • Spike (riscv-isa-sim) (ISS, CPI=1)
    • Spike : eb0a3e2b0a7c57522928be39de95cd9f8c6dc636
    • Spike PK : fix-gcc14-rvv

Toolchains

Models

Frameworks

  • MLonMCU : develop

  • TFLM : 8eb6b23de4470d6a8da3131650d6a67514dfa130

Miscellaneous

  • Used the -O3 flag for compilation.
  • Benchmarks were generated using the MLonMCU deployment tool with minimal effort.
  • Memory metrics are reported in bytes.

Results (Framework: tflm, Backend: tflmi, Toolchain: gcc, Flags: -O3, Target: spike_rv32)

Audio Wake Words (aww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 54363848.0 (0.3x) | 148424 (0.821) | 36144 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 34476456.0 (0.4x) | 156946 (0.868) | 36200 (1.001) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 31869936.0 (0.5x) | 157292 (0.87) | 36204 (1.001) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 31219427.0 (0.5x) | 158376 (0.876) | 36204 (1.001) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 30534285.0 (0.5x) | 159520 (0.882) | 36204 (1.001) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 30144995.0 (0.5x) | 160716 (0.889) | 36200 (1.001) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 29803625.0 (0.5x) | 162722 (0.9) | 36176 (1.001) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 15060815.0 (Base) | 180806 (Base) | 36152 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 15088048.0 (1.0x) | 178130 (0.985) | 36152 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
| 8093709.0 (1.9x) | 203012 (1.123) | 36216 (1.002) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 6685275.0 (2.3x) | 202958 (1.123) | 36220 (1.002) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 6004861.0 (2.5x) | 209620 (1.159) | 36220 (1.002) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 5682416.0 (2.7x) | 221040 (1.223) | 36220 (1.002) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 5497499.0 (2.7x) | 247540 (1.369) | 36216 (1.002) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 5494123.0 (2.7x) | 302598 (1.674) | 36192 (1.001) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 7989645.0 (1.9x) | 183112 (1.013) | 36152 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 5446551.0 (2.8x) | 183112 (1.013) | 36152 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 3931921.0 (3.8x) | 183112 (1.013) | 36152 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 3897348.0 (3.9x) | 183112 (1.013) | 36152 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 3897348.0 (3.9x) | 183112 (1.013) | 36152 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 3904130.0 (3.9x) | 183112 (1.013) | 36152 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 7134408.0 (2.1x) | 200698 (1.11) | 36216 (1.002) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 5293706.0 (2.8x) | 200606 (1.11) | 36220 (1.002) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 4261290.0 (3.5x) | 207268 (1.146) | 36220 (1.002) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 3910675.0 (3.9x) | 218688 (1.21) | 36220 (1.002) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 3687391.0 (4.1x) | 245188 (1.356) | 36216 (1.002) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 3704454.0 (4.1x) | 300258 (1.661) | 36192 (1.001) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
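
The relative columns in these tables appear to be derived from the row marked "Base" (muRISCV-NN, Scalar, RV32GC). The following is a minimal sketch of that arithmetic, assuming the speedup is base cycles divided by row cycles (rounded to one decimal) and the ROM/RAM ratios are row value divided by base value (rounded to three decimals). The function names are illustrative and are not part of MLonMCU.

```python
# Sketch of how the "(Speedup)" and "(rel.)" columns can be reproduced.
# Assumption: the row marked "Base" is the reference for every other row.

def speedup(base_cycles: float, cycles: float) -> float:
    """Speedup over the baseline; values above 1.0 mean fewer cycles than Base."""
    return round(base_cycles / cycles, 1)

def relative(base_value: float, value: float) -> float:
    """Memory footprint relative to the baseline; values above 1.0 mean more bytes."""
    return round(value / base_value, 3)

# Values taken from the Audio Wake Words (aww) results.
BASE_CYCLES = 15060815.0   # muRISCV-NN, Scalar, RV32GC
BASE_ROM = 180806          # bytes

if __name__ == "__main__":
    # TFLM Reference kernels on RV32GC
    print(speedup(BASE_CYCLES, 54363848.0))  # 0.3
    print(relative(BASE_ROM, 148424))        # 0.821
    # muRISCV-NN Vector kernels at VLEN=512
    print(speedup(BASE_CYCLES, 3931921.0))   # 3.8
```

Because Spike is used here as an instruction set simulator with CPI=1, the cycle counts are effectively instruction counts, so the speedups reflect instruction reduction rather than timing on real hardware.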

Image Classification (resnet)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 172295593.0 (0.3x) | 196904 (0.905) | 68916 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 67905087.0 (0.8x) | 210674 (0.969) | 68980 (1.001) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 53031626.0 (1.0x) | 211398 (0.972) | 68984 (1.001) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 47125467.0 (1.2x) | 214156 (0.985) | 68984 (1.001) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 45091198.0 (1.2x) | 216022 (0.993) | 68984 (1.001) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 44753023.0 (1.2x) | 218252 (1.004) | 68980 (1.001) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 44036761.0 (1.2x) | 221856 (1.02) | 68956 (1.001) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 54559765.0 (Base) | 217464 (Base) | 68908 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 72393197.0 (0.8x) | 216724 (0.997) | 68908 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
| 18041661.0 (3.0x) | 249354 (1.147) | 68980 (1.001) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 11533301.0 (4.7x) | 242486 (1.115) | 68984 (1.001) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 8332992.0 (6.5x) | 245616 (1.129) | 68984 (1.001) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 6860235.0 (8.0x) | 247026 (1.136) | 68984 (1.001) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 6163500.0 (8.9x) | 248778 (1.144) | 68980 (1.001) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 5850418.0 (9.3x) | 251434 (1.156) | 68956 (1.001) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 32167007.0 (1.7x) | 224760 (1.034) | 68908 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 19763011.0 (2.8x) | 224760 (1.034) | 68908 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 14136501.0 (3.9x) | 224760 (1.034) | 68908 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 11394884.0 (4.8x) | 224760 (1.034) | 68908 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 9218540.0 (5.9x) | 224760 (1.034) | 68908 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 8638398.0 (6.3x) | 224760 (1.034) | 68908 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 29171367.0 (1.9x) | 248602 (1.143) | 68980 (1.001) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 17067706.0 (3.2x) | 241734 (1.112) | 68984 (1.001) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 10956464.0 (5.0x) | 244864 (1.126) | 68984 (1.001) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 8093209.0 (6.7x) | 246274 (1.132) | 68984 (1.001) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 6789234.0 (8.0x) | 248026 (1.141) | 68980 (1.001) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 6151842.0 (8.9x) | 250682 (1.153) | 68956 (1.001) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |

Anomaly Detection (toycar)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 2784194.0 (0.6x) | 340620 (0.978) | 19424 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 1232013.0 (1.3x) | 343596 (0.987) | 19428 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 835955.0 (2.0x) | 343614 (0.987) | 19428 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 638570.0 (2.6x) | 344052 (0.988) | 19428 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 534694.0 (3.1x) | 344468 (0.989) | 19428 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 485380.0 (3.4x) | 344926 (0.991) | 19428 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 460708.0 (3.6x) | 345668 (0.993) | 19424 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 1648267.0 (Base) | 348140 (Base) | 19424 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 2728325.0 (0.6x) | 348142 (1.0) | 19424 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
| 726502.0 (2.3x) | 349172 (1.003) | 19428 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 544912.0 (3.0x) | 348766 (1.002) | 19428 (1.0) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 454495.0 (3.6x) | 348998 (1.002) | 19428 (1.0) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 409282.0 (4.0x) | 349202 (1.003) | 19428 (1.0) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 398968.0 (4.1x) | 349428 (1.004) | 19428 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 380515.0 (4.3x) | 349810 (1.005) | 19424 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 1993404.0 (0.8x) | 351934 (1.011) | 19424 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1745390.0 (0.9x) | 351934 (1.011) | 19424 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1621383.0 (1.0x) | 351934 (1.011) | 19424 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1560047.0 (1.1x) | 351934 (1.011) | 19424 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1555722.0 (1.1x) | 351934 (1.011) | 19424 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1555172.0 (1.1x) | 351934 (1.011) | 19424 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 1221483.0 (1.3x) | 349174 (1.003) | 19428 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 808554.0 (2.0x) | 348768 (1.002) | 19428 (1.0) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 602899.0 (2.7x) | 349000 (1.002) | 19428 (1.0) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 500104.0 (3.3x) | 349204 (1.003) | 19428 (1.0) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 460999.0 (3.6x) | 349430 (1.004) | 19428 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 428147.0 (3.8x) | 349812 (1.005) | 19424 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |

Visual Wake Words (vww)

| Cycles (Speedup) | Total ROM (rel.) | Total RAM (rel.) | VLEN | Kernels | Mode | Arch | Unroll | Auto-Vectorization |
|---|---|---|---|---|---|---|---|---|
| 140036503.0 (0.3x) | 422066 (0.929) | 134448 (1.0) | 0 | TFLM | Reference | RV32GC | 0 | - |
| 77531620.0 (0.6x) | 430588 (0.947) | 134504 (1.0) | 128 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 69387094.0 (0.7x) | 430934 (0.948) | 134508 (1.0) | 256 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 66085799.0 (0.7x) | 432018 (0.951) | 134508 (1.0) | 512 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 64123882.0 (0.7x) | 433162 (0.953) | 134508 (1.0) | 1024 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 63218733.0 (0.7x) | 434358 (0.956) | 134504 (1.0) | 2048 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 62122887.0 (0.7x) | 436364 (0.96) | 134480 (1.0) | 4096 | TFLM | Reference | RV32GCV | 0 | Loop+SLP |
| 45259044.0 (Base) | 454448 (Base) | 134456 (Base) | 0 | muRISCV-NN | Scalar | RV32GC | 0 | - |
| 45599828.0 (1.0x) | 451772 (0.994) | 134456 (1.0) | 0 | muRISCV-NN | Vector (Portable) | RV32GC | 0 | - |
| 25091654.0 (1.8x) | 476662 (1.049) | 134520 (1.0) | 128 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 21026264.0 (2.2x) | 476608 (1.049) | 134524 (1.001) | 256 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 19035801.0 (2.4x) | 483270 (1.063) | 134524 (1.001) | 512 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 18108838.0 (2.5x) | 494674 (1.089) | 134524 (1.001) | 1024 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 17730388.0 (2.6x) | 521174 (1.147) | 134520 (1.0) | 2048 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 17581231.0 (2.6x) | 576248 (1.268) | 134496 (1.0) | 4096 | muRISCV-NN | Scalar | RV32GCV | 0 | Loop+SLP |
| 25430449.0 (1.8x) | 456754 (1.005) | 134456 (1.0) | 128 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 18543026.0 (2.4x) | 456754 (1.005) | 134456 (1.0) | 256 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 15914790.0 (2.8x) | 456754 (1.005) | 134456 (1.0) | 512 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 14813451.0 (3.1x) | 456754 (1.005) | 134456 (1.0) | 1024 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 14723422.0 (3.1x) | 456754 (1.005) | 134456 (1.0) | 2048 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 14730204.0 (3.1x) | 456754 (1.005) | 134456 (1.0) | 4096 | muRISCV-NN | Vector | RV32GCV | 0 | - |
| 22137934.0 (2.0x) | 474348 (1.044) | 134520 (1.0) | 128 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 17072508.0 (2.7x) | 474256 (1.044) | 134524 (1.001) | 256 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 14168099.0 (3.2x) | 480918 (1.058) | 134524 (1.001) | 512 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 13159922.0 (3.4x) | 492322 (1.083) | 134524 (1.001) | 1024 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 12786564.0 (3.5x) | 518822 (1.142) | 134520 (1.0) | 2048 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |
| 12651382.0 (3.6x) | 573908 (1.263) | 134496 (1.0) | 4096 | muRISCV-NN | Vector (Portable) | RV32GCV | 0 | Loop+SLP |

Original data

Click here to download the raw files for this benchmark.
