version 1.2.1 and 1.3.0 issues #120

xvyaward · 2022-08-11T13:47:13Z

Hello, thank you for your team’s awesome work!
I have some questions about using the bolt framework.

Here's my working environment:

target platform: Android-aarch64
build platform: Linux
Device: Arm v8.2+
inference Precision: BNN_FP16
tested bolt version: both 1.2.1 and 1.3.0

When I ran the same model on both versions (1.2.1 and 1.3.0) using their respective X2bolt and benchmark tools,
the final latency results were almost the same, but the breakdowns (statistics time report) were different.

[ version 1.2.1 ]

[ version 1.3.0 ]

Both cases were run with the loops=1 option, but the statistics time of version 1.2.1 looks like the result of running 10 times. Is this normal?

I made a custom network with a structure like the following:

This network works with version 1.3.0, but not with 1.2.1, because the X2bolt of version 1.2.1 doesn't work properly.

[ X2bolt debug log (version 1.2.1) ]

[ X2bolt debug log (version 1.3.0) ]

As we can see, the version 1.2.1 X2bolt cannot detect an input tensor of Mul_21, so I guess the benchmark program stops at

bolt/inference/engine/src/cnn.cpp

Line 696 in 4bdc81e

    
           std::vector<std::string> curOpInputTensorName = this->operatorTensorMap[opName][0];

(or nearby, checked with debug options).

Mul_21 is executed after the whole left path of the above graph image, so I suspect it was difficult to reuse the result of the ReduceMean op. Of course, there is no problem with the latest version of X2bolt. Is there a way to solve this in the previous version as well?

The reason I use the previous version of the bolt framework is that I saw a significant difference in latency results depending on the version for a specific network model (e.g. Real-to-Binary network, https://arxiv.org/pdf/2003.11535.pdf?fname=cm&font=TypeI).

[ version 1.2.1 ]

[ version 1.3.0 ]

I wonder whether the faster result of version 1.2.1 is some kind of reporting bug in that version, or a genuine result of implementation differences.

Thank you for reading my long issue and I look forward to your answers.

yuxianzhi · 2022-08-15T01:52:00Z

Hi, maybe the v1.2.1 time statistics include warm-up time: we run the model several times before the real inference. v1.3.0 may not include warm-up time and only reports the real inference time.

If you don't want to use warm-up, you can set the parameter -w to 0.
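To illustrate why counting warm-up runs would inflate the per-operator statistics without changing the reported latency of the measured loop, here is a minimal sketch. It is not bolt's code: the function `accumulate_stats`, the operator names, the fixed costs, and the warm-up count of 9 are all hypothetical, chosen only so that the total comes to 10 runs as observed above.

```python
from collections import defaultdict


def accumulate_stats(ops, warmup, loops, count_warmup):
    """Sum per-operator cost over the iterations that are counted.

    ops          : list of (operator_name, cost_ms) pairs (hypothetical fixed costs)
    warmup       : number of warm-up iterations run before the measured loops
    loops        : number of measured iterations
    count_warmup : if True, warm-up iterations also contribute to the statistics
    """
    stats = defaultdict(float)
    for i in range(warmup + loops):
        is_warmup = i < warmup
        for name, cost_ms in ops:
            # A real engine would time the operator here; this sketch adds a fixed cost.
            if count_warmup or not is_warmup:
                stats[name] += cost_ms
    return dict(stats)


ops = [("Conv", 2.0), ("Scale", 0.5)]
with_warmup = accumulate_stats(ops, warmup=9, loops=1, count_warmup=True)
without_warmup = accumulate_stats(ops, warmup=9, loops=1, count_warmup=False)
print(with_warmup)     # statistics cover 10 iterations
print(without_warmup)  # statistics cover only the single measured loop
```

With warm-up set to 0, both variants produce the same statistics, which matches the suggestion to pass -w 0 when comparing the two versions.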

xvyaward · 2022-08-24T12:03:44Z

Thank you for your answer. The warmup option was the answer to question 1.
Meanwhile, can I get an answer or hint for question 2? It's an error that is blocking my progress.

yuxianzhi · 2022-08-25T11:48:57Z

Can you show me your command and the full log?

xvyaward · 2022-08-25T12:19:00Z

All logs and related necessary information are summarized in the following link:
https://sweetsour.notion.site/Bolt-9eea4d1a73694203a64b26f21b4e8cb6

I've been debugging this issue a bit more and I'm guessing it's a MemoryReuseOptimizer related issue. https://github.com/huawei-noah/bolt/blob/master/model_tools/include/OPOptimizers/MemoryReuseOptimizer.hpp

According to the X2bolt log, I confirmed that the reuse_position of the data to be reused was overwritten by other data.

yuxianzhi · 2022-08-29T01:34:46Z

This is caused by bolt's ONNX model converter, https://github.com/huawei-noah/bolt/blob/master/model_tools/src/onnx/onnx_adaptee.h. We map C = A * B to the Scale operator and assume that the Scale operator's weight is tensor B, so if tensor A is the weight, there will be an error: the converter cannot find a valid input tensor.

xxx OT_Scale | -> output

So maybe you can swap the order of your mul operands: C = weight * input => C = input * weight
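The operand swap can also be applied to an exported graph instead of the model source. The following is a minimal sketch, assuming the weight is stored as a graph initializer (weights produced by `Constant` nodes are not handled). The function name `swap_mul_weight_to_second_input` is hypothetical; it only relies on the standard ONNX graph fields `initializer`, `node`, `op_type`, and `input`.

```python
def swap_mul_weight_to_second_input(graph):
    """Reorder Mul inputs so that a constant (initializer) operand comes second.

    `graph` is expected to look like an ONNX GraphProto: it has `initializer`
    (items with a `name`) and `node` (items with `name`, `op_type`, `input`).
    Returns the names of the nodes whose inputs were swapped.
    """
    weight_names = {init.name for init in graph.initializer}
    swapped = []
    for node in graph.node:
        if node.op_type != "Mul" or len(node.input) != 2:
            continue
        first, second = node.input[0], node.input[1]
        # Only swap the "C = weight * input" pattern; multiplication is
        # commutative, so the result of the node is unchanged.
        if first in weight_names and second not in weight_names:
            node.input[0], node.input[1] = second, first
            swapped.append(node.name)
    return swapped


# Usage with the onnx package (file names are placeholders):
#   import onnx
#   model = onnx.load("model.onnx")
#   print(swap_mul_weight_to_second_input(model.graph))
#   onnx.save(model, "model_swapped.onnx")
```

Nodes that already have the form C = input * weight, and Mul nodes with two non-constant inputs, are left untouched.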

The Scale operator performs better than Eltwise in bolt, because Eltwise carries redundant code to handle its bcast mode. From the figure we can see that in v1.3 more operators are mapped to Eltwise. Maybe we can fix this in bolt's ONNX model converter, or in bolt's tensor_computing or inference engine module (switching some Eltwise computation to Scale). https://github.com/huawei-noah/bolt/blob/master/inference/engine/include/cpu/eltwise_cpu.hpp

yuxianzhi · 2022-08-29T01:37:07Z

Sorry for the late reply. Maybe you can join Bolt's QQ group 833345709 or contact me on WeChat at cos_wave.

xvyaward · 2022-09-01T04:08:26Z

Thanks, this solved the problem.
I never expected the order of operands to be an issue.
Again, thanks for providing a useful library :)
