Bug: parse_wheel_filename accepts wheel filenames with unsorted compressed tag sets #909

@woodruffw

Hello packaging maintainers!

I'm filing this as a bug report for some observed behavior.

TL;DR: parse_wheel_filename appears to allow compressed tag sets (defined in PEP 425) to appear in any order, while the PEP and living spec both require the order to be "sorted" (my emphasis below):

To allow for compact filenames of bdists that work with more than one compatibility tag triple, each tag in a filename can instead be a ‘.’-separated, sorted, set of tags. For example, pip, a pure-Python package that is written to run under Python 2 and 3 with the same source code, could distribute a bdist with the tag py2.py3-none-any.

(Permalinks: https://peps.python.org/pep-0425/#compressed-tag-sets and https://packaging.python.org/en/latest/specifications/platform-compatibility-tags/#compressed-tag-sets)

This can be seen by using parse_wheel_filename to parse a wheel whose compressed tags are out of order:

from packaging.utils import parse_wheel_filename

name = "pyvirtualcam-0.13.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
parse_wheel_filename(name) 

Actual behavior

The above call yields:

('pyvirtualcam', <Version('0.13.0')>, (), frozenset({<cp310-cp310-manylinux_2_17_x86_64 @ 4370315200>, <cp310-cp310-manylinux2014_x86_64 @ 4370315520>}))

Expected behavior

I expected the above to raise a (subclass of) ValueError, since the compressed tag set appears in the wrong order in the wheel's filename.

The right order would be: pyvirtualcam-0.13.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.
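
A minimal sketch of the kind of check described above, for illustration only. It is not packaging's implementation: the function name check_compressed_tags_sorted is hypothetical, and it assumes "sorted" means plain lexicographic string order, which is the ordering that yields the expected filename above.

```python
def check_compressed_tags_sorted(filename: str) -> None:
    """Raise ValueError if a compressed tag set in a wheel filename is unsorted.

    Hypothetical helper: assumes "sorted" means plain lexicographic order.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")

    # name-version[-build]-python-abi-platform
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of components: {filename!r}")

    # The last three components are the (possibly compressed) tag sets.
    for label, component in zip(("python", "abi", "platform"), parts[-3:]):
        tags = component.split(".")
        if tags != sorted(tags):
            raise ValueError(
                f"{label} tag set {component!r} is not sorted; "
                f"expected {'.'.join(sorted(tags))!r}"
            )


# The filename from this report is rejected; the reordered one is accepted.
for candidate in (
    "pyvirtualcam-0.13.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "pyvirtualcam-0.13.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl",
):
    try:
        check_compressed_tags_sorted(candidate)
        print("ok:", candidate)
    except ValueError as exc:
        print("rejected:", exc)
```

Under that assumption, manylinux2014_x86_64 sorts before manylinux_2_17_x86_64 because "2" compares lower than "_" in ASCII.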

Additional context

This was noticed due to an interaction of a few different components in pypi/warehouse#18128:

  1. A user built a wheel (named pyvirtualcam-0.13.0-cp310-cp310-linux_x86_64.whl);
  2. They called auditwheel repair on the wheel, which re-tagged it as pyvirtualcam-0.13.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl. This tag ordering is incorrect per PEP 425, and is being tracked against auditwheel in auditwheel#583 ("Compressed tag sets are not sorted when rewriting wheel filename");
  3. packaging doesn't warn or fail with the new wheel distribution name, per this bug report;
  4. pypi-attestations then "ultranormalized" the wheel name for signing purposes, fixing it up to pyvirtualcam-0.13.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl;
  5. Finally, the end user attempted to retrieve their provenance metadata via https://pypi.org/integrity/pyvirtualcam/0.13.0/pyvirtualcam-0.13.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl/provenance and were confused when it wasn't present (since it's actually present at https://pypi.org/integrity/pyvirtualcam/0.13.0/pyvirtualcam-0.13.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl/provenance)
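
To make step 4 concrete, here is a rough sketch of what re-sorting the compressed tag sets does to the filename. It is not the pypi-attestations code: sort_compressed_tags is a hypothetical name, and lexicographic ordering is assumed.

```python
def sort_compressed_tags(filename: str) -> str:
    """Return the wheel filename with each compressed tag set sorted.

    Hypothetical helper: assumes a well-formed wheel filename and
    plain lexicographic ordering of the tags.
    """
    stem, ext = filename[: -len(".whl")], ".whl"
    parts = stem.split("-")
    # Only the trailing python/abi/platform components are tag sets.
    parts[-3:] = [".".join(sorted(p.split("."))) for p in parts[-3:]]
    return "-".join(parts) + ext


uploaded = "pyvirtualcam-0.13.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
normalized = sort_compressed_tags(uploaded)

# The two names differ, so a lookup keyed on one will miss the other.
print(normalized)
print(normalized == uploaded)
```

Any component that rewrites the name this way ends up addressing a different filename from the one that was actually uploaded, which is the mismatch the user ran into.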

TL;DR: multiple components behaved in slightly different ways here, resulting in maximum confusion:

  1. auditwheel repair produced an incorrect tag ordering;
  2. packaging silently honored that incorrect tag ordering, causing PyPI to also honor it;
  3. pypi-attestations attempted to "fix" it per PEP 425, leading to a correct tag ordering but one that confused the user (since all other components used the invalid-but-accepted one). This last part has been fixed in trailofbits/pypi-attestations#124 ("fix: remove ultranormalization of distribution filenames").

See also #873 for a related (but different) issue in wheel filename parsing 🙂

CC @di for viz
