Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support decoding list/indefinite list from cbor #331

Draft
wants to merge 53 commits into
base: main
Choose a base branch
from

Conversation

theeldermillenial
Copy link
Contributor

tl;dr:
This PR adds support for proper parsing of fixed vs indefinite lists from CBOR.

Background
Currently, pycardano uses a custom encoder to embed indefinite lists when encoding classes to cbor, even if the length of the list is less than 31 elements.

This creates an issue when decoding with cbor2, because cbor2 decodes all lists to the list class without any indication of whether the list was encoded as a fixed/indefinite list. This loss of information is problematic because hashes are frequently generated off of CBOR, so encoding as a fixed list vs indefinite list generates a different hash even if the content of the datum is the same.

Implementation Detail
This PR is unfortunately dependent on a minor fix in cbor2. The CBORDecoder class doesn't have a custom decoding mechanism to make modifications to how data is decoded. One approach to fixing this issue would be to subclass CBORDecoder to intervene when an indefinite list is found. However, the CBORDecoder was implemented in such a way that it relies on a global dictionary that references internal method definitions on a class, making subclassing challenging. To resolve this issue a PR was opened to cbor2 that simply moves the global dictionaries inside the CBORDecoder class, which has no impact on any other section of code:
agronholm/cbor2#225

This PR takes advantage of that PR to subclass the CBORDecoderand intervene in the case of an indefinite list. When an indefinite list is discovered, it is cast to IndefiniteList and returned, while all other lists are returned as a standard list. This required some additional changes, specifically a change how data is being decoded on the pycardano side.

This should fix a few different Issues, and fixed a previously undiscovered issue related to roundtrip cbor encoding of datums (datum -> cbor ->datum -> cbor).

fixes #311
fixes #330

@codecov-commenter
Copy link

codecov-commenter commented Mar 14, 2024

Codecov Report

Attention: Patch coverage is 81.57895% with 7 lines in your changes missing coverage. Please review.

Project coverage is 83.92%. Comparing base (3d65c67) to head (b06c153).

Files Patch % Lines
pycardano/serialization.py 87.09% 1 Missing and 3 partials ⚠️
pycardano/witness.py 0.00% 0 Missing and 2 partials ⚠️
pycardano/hash.py 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #331      +/-   ##
==========================================
- Coverage   83.98%   83.92%   -0.06%     
==========================================
  Files          29       29              
  Lines        3733     3757      +24     
  Branches      941      946       +5     
==========================================
+ Hits         3135     3153      +18     
- Misses        433      436       +3     
- Partials      165      168       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@theeldermillenial
Copy link
Contributor Author

I should mention I found a few other things as well.

Most importantly, this should not be merged in its current state. It might be appropriate to convert this to WIP until the cbor2 fix is merged.

There seemed to be type errors in code I did not touch. It might be related to the changes I made, but it could also be due to me updating packages.

@nielstron
Copy link
Contributor

pycardano/plutus.py:77: error: Incompatible types in assignment (expression has type "list[Any]", target has type "bytes") [assignment]
Installing missing stub packages:
pycardano/plutus.py:565: error: Return type "CBORTag" of "to_shallow_primitive" incompatible with return type "list[bytes | bytearray | ByteString | str | int | float | Decimal | bool | tuple[Any, ...] | list[Any] | IndefiniteList | dict[Any, Any] | defaultdict[Any, Any] | OrderedDict[Any, Any] | UndefinedType | datetime | Pattern[Any] | CBORSimpleValue | CBORTag | set[Any] | frozenset[Any] | frozendict[Any, Any] | FrozenList[Any] | IndefiniteFrozenList | None]" in supertype "ArrayCBORSerializable" [override]
pycardano/plutus.py:791: error: Return type "PlutusData | dict[Any, Any] | int | bytes | IndefiniteList | RawCBOR | CBORTag" of "to_primitive" incompatible with return type "bytes | bytearray | ByteString | str | int | float | Decimal | bool | tuple[Any, ...] | list[Any] | IndefiniteList | dict[Any, Any] | defaultdict[Any, Any] | OrderedDict[Any, Any] | UndefinedType | datetime | Pattern[Any] | CBORSimpleValue | CBORTag | set[Any] | frozenset[Any] | frozendict[Any, Any] | FrozenList[Any] | IndefiniteFrozenList | None" in supertype "CBORSerializable" [override]
pycardano/cip/cip8.py:120: error: Incompatible types in assignment (expression has type "dict[str, Any]", variable has type "str") [assignment]
pycardano/transaction.py:296: error: Incompatible types in assignment (expression has type "DatumHash | Any", variable has type "CBORTag") [assignment]
/home/runner/.cache/pypoetry/virtualenvs/pycardano-XSOAH-2F-py3.10/bin/python -m pip install types-cachetools types-requests

There seem to be some type errors in the static analyzer. Either there is a bug or the analyzer needs more/better type hints

@nielstron nielstron changed the title Support decoding list/indefinite list from cbor [WIP] Support decoding list/indefinite list from cbor Mar 18, 2024
@nielstron nielstron marked this pull request as draft March 18, 2024 09:58
@nielstron nielstron changed the title [WIP] Support decoding list/indefinite list from cbor Support decoding list/indefinite list from cbor Mar 18, 2024
@theeldermillenial
Copy link
Contributor Author

Yes, that is exactly what I was talking about. I'm not entirely sure how those type errors got injected into this with my changes.

On the bright side, I do know this works because I've been using it.

I'm trying to get the attention of the cbor2 maintainer to merge the fix.

Not sure what is happening in datum-options - the types seem wrong?
@nielstron
Copy link
Contributor

I guess suddenly the return type of cbor2.encode which was previously untyped came into scope and unveiled a bunch of mismatching types :)

theeldermillenial and others added 25 commits September 1, 2024 20:30
Refactor Redeemer handling in TransactionWitnessSet
@cffls
Copy link
Collaborator

cffls commented Nov 2, 2024

Revisiting this PR.. @theeldermillenial is it still WIP, or we can prepare for a merge?

@theeldermillenial
Copy link
Contributor Author

@cffls Unfortunately still WIP. The maintainer of cbor2 wants to make sure that the fix I made has feature parity with the C extension. Unfortunately, that is something that will take more time than I currently have.

I am going to be adding someone to my team, and hopefully they can try to tackle this or free up enough of my time to fix it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants