
ConvertToChannelsLastAndClean: New transpose nodes don't have names and intermediate tensors are not kept in place #9

Open · heborras opened this issue Nov 18, 2021 · 2 comments
Labels: bug (Something isn't working)

@heborras (Member)

Prerequisites

Please make sure to check off these prerequisites before submitting a bug report.

  • Test that the bug appears on the current version of the main branch. Make sure to include the commit hash of the commit you checked out.
  • Check that the issue hasn't already been reported, by checking the currently open issues.
  • If there are steps to reproduce the problem, make sure to write them down below.
  • If relevant, please include the ONNX files that were created directly before and/or after the bug.

Quick summary

This is technically two bugs:

  • New Transpose nodes don't have names after the transformation completes.
    • This causes issues for hls4ml, since all nodes in a graph must be named for proper ingestion.
  • After the transformation completes, some intermediate tensors are no longer in their original place, but have moved somewhere else.
    • This seems to be a side effect of at least the MoveChanLastUpstream transformation not properly maintaining order.

Steps to Reproduce

  1. Clone the qonnx repository
  2. Checkout the main branch
  3. Run the to_channels_last utility on the CNV_W2A2.onnx file from the model zoo (a Python sketch follows this list)
  4. Inspect the resulting CNV_W2A2_channels_last.onnx in Netron
  5. The resulting file is also attached: CNV_2W2A_clean_channels_last.zip
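For step 3, a minimal sketch of the conversion from Python; this assumes the transformation is importable from qonnx.transformation.channels_last and is applied via ModelWrapper.transform() like other qonnx transformations, and the file names are placeholders:

```python
# Minimal sketch, not the exact CLI invocation: the module path for the
# transformation is assumed here, and the file names are placeholders.
from qonnx.core.modelwrapper import ModelWrapper
from qonnx.transformation.channels_last import ConvertToChannelsLastAndClean

model = ModelWrapper("CNV_W2A2.onnx")
model = model.transform(ConvertToChannelsLastAndClean())
model.save("CNV_W2A2_channels_last.onnx")
```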

Expected behavior

  • All nodes should have names
  • Tensors should stay in place

Actual behavior

  • The name of the input Transpose is now missing (a quick check for unnamed nodes is sketched below)
  • The first Mul node now has an output tensor called "Sub_0_out0" and an input called "Mul_0_out0", so the tensors are not in place anymore
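A minimal sketch of such a check for the first symptom, assuming the converted model is loaded with plain onnx (file name as attached above):

```python
# Flag nodes that ended up without a name after the transformation.
import onnx

model = onnx.load("CNV_W2A2_channels_last.onnx")
unnamed = [node.op_type for node in model.graph.node if not node.name]
print("nodes without names:", unnamed)  # expected: [], observed: contains "Transpose"
```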

Optional

Possible fix

Some ideas on fixes:

  • The issue could be hotfixed by running GiveUniqueNodeNames and GiveReadableTensorNames again after the transformation (see the sketch after this list)
    • For the naming this might be fine, but ideally the Transposes should have names upon insertion
    • For the tensors moving around, this would not be a clean solution, since it would only hide the fact that the tensors are not in-place anymore
  • Non-hotfix way:
    • Give Transposes proper unique names upon insertion
    • Check what goes wrong during Transpose insertion/deletion while moving Transposes up/down the graph
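For illustration, the hotfix could look roughly like the following sketch; it assumes GiveUniqueNodeNames and GiveReadableTensorNames are importable from qonnx.transformation.general (as in FINN) and reuses the conversion sketch from above:

```python
# Hotfix sketch: re-run the naming passes after the channels-last conversion.
# Import paths and file names are assumptions, not verified against the repo.
from qonnx.core.modelwrapper import ModelWrapper
from qonnx.transformation.channels_last import ConvertToChannelsLastAndClean
from qonnx.transformation.general import GiveReadableTensorNames, GiveUniqueNodeNames

model = ModelWrapper("CNV_W2A2.onnx")
model = model.transform(ConvertToChannelsLastAndClean())
# give every node (including the newly inserted Transposes) a unique name,
# then rename the tensors to match their producing nodes
model = model.transform(GiveUniqueNodeNames())
model = model.transform(GiveReadableTensorNames())
model.save("CNV_W2A2_channels_last_renamed.onnx")
```

As noted above, this only hides the tensor-placement issue rather than fixing it.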
heborras added the bug (Something isn't working) label on Nov 18, 2021
@maltanar (Collaborator)

What is meant by "tensors are not in-place anymore" here? What is the expected behavior with regard to tensor names?

@heborras (Member Author)

What I would expect is that the output tensor of Mul_0 is still Mul_0_out0 and not Sub_0_out0. A similar thing happens for the Sub_0 node: here the output becomes Quant_9_out0 instead of Sub_0_out0. Interestingly, Quant_9 is the next node, so it looks like the tensors got moved up by one position (the snippet below makes this visible).
Since I'm not touching the names of the tensors during the transformation, these changed names lead me to believe that some tensors get moved into places where they shouldn't be.
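A small sketch to make the shift visible, using plain onnx on the converted file attached above; the expected/observed comments restate the tensor names mentioned in this thread:

```python
# Print each of the first few nodes next to its output tensors to spot the
# one-position shift between node names and tensor names.
import onnx

model = onnx.load("CNV_W2A2_channels_last.onnx")
for node in list(model.graph.node)[:6]:
    print(node.op_type, node.name or "<unnamed>", "->", list(node.output))
# expected: Mul_0 -> ['Mul_0_out0'], Sub_0 -> ['Sub_0_out0'], ...
# observed: Mul_0 -> ['Sub_0_out0'], Sub_0 -> ['Quant_9_out0'], ...
```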
