ONNX broken for identity function #25763

Open
4 tasks done
cdeln opened this issue Jun 13, 2024 · 6 comments
Labels
bug category: dnn (onnx) ONNX suport issues in DNN module category: dnn

@cdeln
cdeln commented Jun 13, 2024

System Information

OpenCV python version: 4.10.0.82
PyTorch version: 2.0.0+cu117
Operating System / Platform: Ubuntu 22.04
Python version: 3.10.6

Detailed description

I have been having some issues with ONNX files lately.
Decided to check that the simplest function of them all works: the identity ...

Here is a script that shows for which input shapes ONNX is broken (0-, 1- and 3-dimensional inputs).
I redirect stdout when exporting the ONNX file to make the output more readable.

I suspect that this is the underlying cause of #25762

Steps to reproduce

import torch
import cv2 as cv
from contextlib import redirect_stdout

class Identity(torch.nn.Module):

    def __init__(self):
        super().__init__()

    def forward(self, X):
        return X

model = Identity()
shape = (2,3,5,7,11,13,17,19)

for i in range(1 + len(shape)):
    X = torch.zeros(shape[:i])
    with open('/dev/null', 'w') as f:
        with redirect_stdout(f):
            torch.onnx.export(model, X, '/tmp/model.onnx', output_names=['Y'])

    Y = model(X)
    ok = X.shape == Y.shape
    # Print the number of dimensions first so the lines match the output format below
    print(i, 'OK   ' if ok else 'ERROR', 'PyTorch', tuple(X.shape), '->', tuple(Y.shape))

    net = cv.dnn.readNetFromONNX('/tmp/model.onnx')
    X = X.numpy()
    net.setInput(X)

    Y = net.forward(['Y'])[0]
    ok = X.shape == Y.shape
    print(i, 'OK   ' if ok else 'ERROR', 'OpenCV ', X.shape, '->', Y.shape)

This gives me the following output (format: number of dimensions, OK/ERROR, PyTorch/OpenCV, source shape, target shape):

0 OK    PyTorch () -> ()
0 ERROR OpenCV  () -> (1, 1)
1 OK    PyTorch (2,) -> (2,)
1 ERROR OpenCV  (2,) -> (2, 1)
2 OK    PyTorch (2, 3) -> (2, 3)
2 OK    OpenCV  (2, 3) -> (2, 3)
3 OK    PyTorch (2, 3, 5) -> (2, 3, 5)
3 ERROR OpenCV  (2, 3, 5) -> (2, 3)
4 OK    PyTorch (2, 3, 5, 7) -> (2, 3, 5, 7)
4 OK    OpenCV  (2, 3, 5, 7) -> (2, 3, 5, 7)
5 OK    PyTorch (2, 3, 5, 7, 11) -> (2, 3, 5, 7, 11)
5 OK    OpenCV  (2, 3, 5, 7, 11) -> (2, 3, 5, 7, 11)
6 OK    PyTorch (2, 3, 5, 7, 11, 13) -> (2, 3, 5, 7, 11, 13)
6 OK    OpenCV  (2, 3, 5, 7, 11, 13) -> (2, 3, 5, 7, 11, 13)
7 OK    PyTorch (2, 3, 5, 7, 11, 13, 17) -> (2, 3, 5, 7, 11, 13, 17)
7 OK    OpenCV  (2, 3, 5, 7, 11, 13, 17) -> (2, 3, 5, 7, 11, 13, 17)
8 OK    PyTorch (2, 3, 5, 7, 11, 13, 17, 19) -> (2, 3, 5, 7, 11, 13, 17, 19)
8 OK    OpenCV  (2, 3, 5, 7, 11, 13, 17, 19) -> (2, 3, 5, 7, 11, 13, 17, 19)

As you can see, it's broken for 0-, 1- and 3-dimensional inputs.

Issue submission checklist

  • I report the issue, it's not a question
  • I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
  • I updated to the latest OpenCV version and the issue is still there
  • There is reproducer code and related data files (videos, images, onnx, etc)
@asmorkalov
Contributor

@Abdurrahheem could you take a look?

@asmorkalov asmorkalov added the category: dnn (onnx) ONNX suport issues in DNN module label Jun 14, 2024
@fengyuentau
Member

fengyuentau commented Jun 17, 2024

Error cases are:

0 ERROR OpenCV  () -> (1, 1)
1 ERROR OpenCV  (2,) -> (2, 1)
3 ERROR OpenCV  (2, 3, 5) -> (2, 3)

Let me explain one by one.

  1. 0 ERROR OpenCV () -> (1, 1): OpenCV Mat does not support the 0d and 1d cases. When a 0d tensor is loaded into a Mat, it is converted to a Mat of shape (1, 1).
  2. 1 ERROR OpenCV (2,) -> (2, 1): Similar to the above. A 1d tensor of shape (n,) becomes a Mat of shape (n, 1) when converted, which breaks the rules of broadcasting.
  3. 3 ERROR OpenCV (2, 3, 5) -> (2, 3): In the OpenCV Python interface, a three-dimensional (a, b, c) tensor is loaded into a Mat of shape (a, b) with c channels. For more details, see dnn: getting wrong input shape if using python interfaces #23242. This is a legacy problem which should be solved in 5.x, I think.
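
The three cases above can be written down as a small pure-Python model. This is only a sketch of the behavior as described in this thread, not OpenCV code: `described_output_shape` is a hypothetical helper and makes no OpenCV calls. It reproduces the OK/ERROR pattern from the report.

```python
from typing import Tuple


def described_output_shape(shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Hypothetical model of the shape OpenCV 4.x (Python) reports back,
    following the three cases explained in this thread."""
    ndim = len(shape)
    if ndim == 0:
        # Case 1: a 0d tensor ends up as a Mat of shape (1, 1).
        return (1, 1)
    if ndim == 1:
        # Case 2: a 1d tensor of shape (n,) ends up as a Mat of shape (n, 1).
        return (shape[0], 1)
    if ndim == 3:
        # Case 3: (a, b, c) is read as an (a, b) Mat with c channels,
        # so the last axis is missing from the reported shape.
        return tuple(shape[:2])
    # All other ranks round-tripped unchanged in the report.
    return tuple(shape)


full = (2, 3, 5, 7, 11, 13, 17, 19)
for i in range(len(full) + 1):
    src = full[:i]
    dst = described_output_shape(src)
    print(i, 'OK   ' if src == dst else 'ERROR', src, '->', dst)
```

Only ranks 0, 1 and 3 print ERROR, which matches the reproducer output in the original report.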

@cdeln
Author

cdeln commented Jun 17, 2024

Do you mean that the errors for 0 and 1 dimensions are expected? It feels like a very bad bug to me, and it will leave ONNX integration in a broken state until fixed. So I hope this is not considered expected by OpenCV devs.

OpenCV 4.10 is the latest released version, so I am not sure how to test your hypothesis for 5.x without more help. Can you please confirm that this is fixed in 5.x? If that is the case, I think it should be back-ported to 4.11 as well.

@fengyuentau
Member

Do you mean that the errors for 0 and 1 dimensions are expected? It feels like a very bad bug to me, and it will leave ONNX integration in a broken state until fixed. So I hope this is not considered expected by OpenCV devs.

0d/1d Mat support was introduced in 5.x. It was a big patch and breaks some existing things, so a backport is probably not a good idea.

@cdeln
Author

cdeln commented Jun 17, 2024

@fengyuentau Ok good point. Is the fix for 3d input also not backwards compatible or can it be included in 4.x?

@fengyuentau
Member

@fengyuentau Ok good point. Is the fix for 3d input also not backwards compatible or can it be included in 4.x?

The 3d issue is not handled yet in either branch.
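
Until a fix lands, a caller could guard against the affected ranks before passing data to the network. The following is a minimal sketch based only on the statements in this thread (0d/1d support exists from 5.x on, the 3d case is unresolved in both branches). `shape_may_change` is a hypothetical helper, not part of OpenCV, and the version strings in the examples are illustrative.

```python
def shape_may_change(ndim: int, opencv_version: str) -> bool:
    """Return True if, per this thread, an input with `ndim` dimensions
    may come back from the Python DNN interface with a different shape.

    Assumption: the major version is the first dot-separated field of the
    version string, e.g. '4.10.0' -> 4.
    """
    major = int(opencv_version.split('.')[0])
    if ndim in (0, 1):
        # 0d/1d Mat support was introduced in 5.x according to the thread.
        return major < 5
    if ndim == 3:
        # The 3d case is reported as not handled in either branch.
        return True
    return False


# Example checks with illustrative version strings.
print(shape_may_change(1, '4.10.0'))
print(shape_may_change(1, '5.0.0'))
print(shape_may_change(3, '5.0.0'))
print(shape_may_change(4, '4.10.0'))
```

In a real script the version string would typically come from `cv.__version__`, and the rank from `X.ndim` of the NumPy input.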

5 participants