pre_explicit_broadcast should not expand scalar tensor #573
Comments
It is a pain in the ass to generate and test the ONNX myself for your request. Please provide me with an ONNX file, even a partial model. I know all I have to do is write a few lines of code in PyTorch. But even that is a pain in the ass.
For sure, here is an ONNX file to reproduce: https://mythic.box.com/s/wmwsrpd41vpe27pqv1k3c7z3lb2daxqr
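For context, a repro model with this Input -> Gemm -> Mul(scalar) pattern really is only a few lines of PyTorch. The sketch below is hypothetical; the exact architecture of the shared file is an assumption:

```python
import torch

class GemmMul(torch.nn.Module):
    """Hypothetical repro: Input -> Gemm -> Mul(scalar)."""
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(16, 8)  # nn.Linear on a 2-D input exports as Gemm

    def forward(self, x):
        return self.fc(x) * 0.5  # scalar Mul following the Gemm

# batch=1 input, matching the case discussed in this issue
torch.onnx.export(GemmMul(), torch.randn(1, 16), "repro_gemm_mul.onnx")
```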
The tool has an option to optimize for GPU Delegate, but I noticed a bug in Gemm's implementation during testing, so I fixed it and released v1.19.8.
Thanks for the fix. Unfortunately, I don't want to use the GPU Delegate optimization option for this model.
I think you have misunderstood something. See lines 194 to 205 at commit 6d0c3d8:

```python
# cast to attribute data type
x = tf.cast(x, tf.float32)
y = tf.cast(y, tf.float32)
z = tf.cast(z, tf.float32)
if not optimization_for_gpu_delegate:
    if z is not None:
        result = tf.convert_to_tensor(alpha) * tf.matmul(x, y) + tf.convert_to_tensor(beta) * z  # <----- here
    else:
        result = tf.convert_to_tensor(alpha) * tf.matmul(x, y) + tf.convert_to_tensor(beta)
else:
    result = tf.convert_to_tensor(alpha) * tf.matmul(x, y) - (tf.convert_to_tensor(beta) * z) * tf.convert_to_tensor(-1.0, dtype=z.dtype)
```

A debug printout proves that the value is converted as a scalar.
I have known from the beginning that the GPU Delegate cannot handle a bias with more than 2 dimensions. However, there is a bug in the tflite converter where it does not produce a scalar bias. So I implemented a workaround that I don't like: the GPU optimization option. You should file an issue with the NNAPI repository or the TensorFlow repository. By the way, I already posted that discussion in a TensorFlow issue over a year ago, but it has been ignored. Good luck.
I ran an ONNX model that does not have a scalar Mul after the Gemm, and the converted bias stays one-dimensional. In my original ONNX graph, if I remove the scalar Mul, the fully connected bias is no longer expanded.
That's very strange... It will take some time to do a little research. I can only guess at this point, but it is possible that the tflite optimizer is doing something when it optimizes the following Mul, rather than this being a Gemm conversion issue. Somehow, I could roughly guess what the optimizer was doing.

Attachments:
- repro_new_onnx_model_v2.onnx.zip
- repro_new_onnx_model_v2_1.onnx.zip
- repro_new_onnx_model_v2_1_float32.tflite.zip
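For intuition about what the optimizer could be doing here: a scalar Mul after a fully connected layer can be constant-folded into the layer, since the scale distributes over the affine transform. A minimal numpy sketch of the algebra (the exact rewrite the tflite optimizer performs is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 4)).astype(np.float32)  # batch=1 input
W = rng.standard_normal((4, 3)).astype(np.float32)  # FullyConnected weights
b = rng.standard_normal((3,)).astype(np.float32)    # original 1-D bias
s = np.float32(0.5)                                 # scalar Mul constant

# s * (x @ W + b) == x @ (s * W) + (s * b), so the Mul can be folded away.
assert np.allclose(s * (x @ W + b), x @ (s * W) + (s * b))

# If the bias was already expanded to rank 2 (shape (1, 3)) before folding,
# the folded bias keeps that extra dimension in the resulting graph.
b2 = b.reshape(1, 3)
print((s * b2).shape)  # (1, 3) instead of (3,)
```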
Try it. Fixes: https://github.com/PINTO0309/onnx2tf/releases/tag/1.19.10
Thanks for the quick fix. The fix won't work if the graph has a series of scalar operations after the Gemm. Here is an ONNX file with Gemm -> Mul -> Div: https://mythic.box.com/s/i6ginfybruy1ibeg5pmzluxh6vgj9ye3
I understood. However, I am skeptical that we really need to automatically support that pattern in onnx2tf. I understand that a clean conversion would be great for everyone, but the two consecutive scalar Mul and Div should be able to be pre-fused before exporting to ONNX. I don't know how you are generating your model, but wouldn't you just calculate the combined scalar constant in advance?
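For illustration, that fusion amounts to computing mul_const / div_const once before export. A hypothetical PyTorch sketch (module and constant names are mine, not from the model in question):

```python
import torch

class Head(torch.nn.Module):
    """Hypothetical stand-in for the exported graph:
    Linear (-> Gemm) followed by a scalar Mul and a scalar Div."""
    def __init__(self, mul_const: float, div_const: float):
        super().__init__()
        self.fc = torch.nn.Linear(4, 3)
        # Pre-fuse the two scalar ops into a single constant so the
        # exported ONNX contains at most one scalar Mul after the Gemm.
        self.scale = mul_const / div_const

    def forward(self, x):
        return self.fc(x) * self.scale

torch.onnx.export(Head(0.5, 2.0), torch.randn(1, 4), "head_fused.onnx")
```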
This is an ONNX model which relies on ONNX Runtime to do graph optimization (including fusion), but ONNX Runtime doesn't export the optimized model, as its optimization is done in memory only. I might be able to trace its PyTorch model and do the fusion there, where the operators are separated for different quantization needs per hardware backend, although that becomes irrelevant for float32 in the final tflite model. Is …
I'll look into it. Idea note: search for all consecutive scalar operations behind the Gemm.
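A rough sketch of what that search could look like with the onnx Python API (the function name and folding strategy are assumptions, not the actual onnx2tf implementation):

```python
import onnx
from onnx import numpy_helper

def collect_scalar_chain(model: onnx.ModelProto, gemm_output: str):
    """Walk Mul/Div nodes that consume `gemm_output`, each multiplying or
    dividing by a scalar initializer; return the combined scale factor and
    the name of the last output in the chain."""
    inits = {i.name: numpy_helper.to_array(i) for i in model.graph.initializer}
    consumers = {}
    for node in model.graph.node:
        for inp in node.input:
            consumers.setdefault(inp, []).append(node)

    scale, current = 1.0, gemm_output
    while True:
        nexts = consumers.get(current, [])
        # Only fold when the value has exactly one consumer and that
        # consumer is a Mul/Div whose second input is a scalar constant.
        if len(nexts) != 1 or nexts[0].op_type not in ("Mul", "Div"):
            break
        node = nexts[0]
        const = inits.get(node.input[1])
        if const is None or const.size != 1:
            break
        v = float(const.item())
        scale = scale * v if node.op_type == "Mul" else scale / v
        current = node.output[0]
    return scale, current
```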
I think the consecutive-scalar pattern searching method would unfortunately have some complexity. I looked at the code change (621bd27), and it seems the original intent was not for scalar tensors, so perhaps a workaround is to have an extra user-specified option to not expand the tensor for Mul/Div nodes, etc. But again, it is a user-defined option, which I understand adds extra maintenance liability. I'm fine with closing the issue as is, and I will try to clean the graph on my end.
Issue Type: Feature Request
OS: Linux
onnx2tf version number: 1.19.7
onnx version number: 1.15.0
onnxruntime version number: 1.16.3
onnxsim (onnx_simplifier) version number: 0.4.33
tensorflow version number: 2.15.0
Download URL for ONNX:
Parameter Replacement JSON: none

Description
The converted tflite graph cannot run through ArmNN with the Arm Compute Library GPU (which is faster than tflite on the native CPU).

For the ONNX node pattern Input -> Gemm -> Mul (with a scalar), with batch=1, the converted tflite graph contains a fully connected layer whose bias has two dimensions, while the original bias has one dimension:
[Screenshot (2024-01-17): the converted tflite graph, showing the FullyConnected layer's two-dimensional bias]
Since the new dimension is 1, mathematically the converted tflite graph is still correct. Unfortunately, Arm Compute Library currently errors out if the bias has more than one dimension: https://github.com/ARM-software/ComputeLibrary/blob/c2a79a4b8c51ce835eaf984f3a1370447b3282c4/src/cpu/operators/CpuFullyConnected.cpp#L431
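For reference, the bias shapes in a converted model can be inspected with the TFLite interpreter. A minimal sketch (the model path is a placeholder, and filtering tensor names by "bias" is an assumption about how the converter names them):

```python
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model_float32.tflite")  # placeholder path
interpreter.allocate_tensors()

for detail in interpreter.get_tensor_details():
    # A FullyConnected bias expanded by the converter shows up with a
    # shape like [1, N] here, instead of the original [N].
    if "bias" in detail["name"].lower():
        print(detail["name"], detail["shape"])
```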
pre_explicit_broadcast
https://github.com/PINTO0309/onnx2tf/blob/4e8f29129e6ea03f45f9d6520cc077522a790249/onnx2tf/utils/common_functions.py#L845:L858

Can this function be changed to do nothing if the tensor is a scalar?
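For illustration, the requested change could be as small as an early return when a tensor is rank-0. A hedged sketch of the idea (simplified signature; not the actual code in common_functions.py):

```python
import tensorflow as tf

def pre_explicit_broadcast_sketch(x: tf.Tensor, y: tf.Tensor):
    """Sketch of the requested behavior: leave scalars untouched so that a
    scalar constant (e.g. a Mul operand) is never expanded to match the
    other operand's rank."""
    # Rank-0 tensors broadcast implicitly in TensorFlow, so expanding them
    # only bakes extra dimensions (like a [1, N] bias) into the graph.
    if x.shape.rank == 0 or y.shape.rank == 0:
        return x, y
    # ... the existing expand/reshape logic would run here ...
    return x, y
```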