Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Major update for the xml module #13349

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

Major update for the xml module #13349

wants to merge 12 commits into from

Conversation

tungol
Copy link
Contributor

@tungol tungol commented Dec 31, 2024

This MR adds typing for everything in the xml module, which had a lot of incomplete parts still.

I did this by taking a copy of the implementation, copying all of our current stubs inline, and then running mypy --strict on it and fixing all the (many, many) problems. I found a few very old bugs in the process: python/cpython#128302

The implementation in this module is heavily duck-typed, and I've used generic in some possibly questionable ways to accommodate some of that; see some of my comments about it below. Otherwise, I think this is ready for review now.

Related to #6886.

@tungol tungol marked this pull request as draft December 31, 2024 06:01

This comment has been minimized.

@tungol
Copy link
Contributor Author

tungol commented Dec 31, 2024

Not as much noise from mypy-primer as I feared. I'll take a look at those later.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@tungol
Copy link
Contributor Author

tungol commented Jan 1, 2025

The couple new errors in mypy-primer come from using xml.etree.ElementTree.parse() to get an instance of ElementTree, and then calling getroot() on it. The root node of an element tree can be set to None, but creating one with the parse() function guarantees that it's not None. Last commit makes ElementTree generic with a TypeVar of either Element or Element | None so that you get ElementTree[Element] when using parse but ElementTree[Element | None] if you're using it directly. Let's see if that works.

This comment has been minimized.

@tungol
Copy link
Contributor Author

tungol commented Jan 1, 2025

It did work as expected; the remaining issue is the same but they're using defusedxml, which has it's own parse() method that is just annotated as returning ElementTree and I made the default of the typevar Element | None.

It'd be better if the default was Element, but I'm not sure how to make using the class directly default to Element | None while making the default in a type annotation something else. Possibly if I put in a __new__ method instead of the __init__ but I don't really want to do that because it's not actually a method of ElementTree. Maybe it'd be better to just pretend that root is never None and move on.

@tungol tungol marked this pull request as ready for review January 1, 2025 02:36
Copy link
Contributor

github-actions bot commented Jan 2, 2025

Diff from mypy_primer, showing the effect of this PR on open source code:

vision (https://github.com/pytorch/vision)
- torchvision/datasets/voc.py:10: error: Incompatible import of "ET_parse" (imported name has type "Callable[[int | str | bytes | PathLike[str] | PathLike[bytes] | SupportsRead[bytes] | SupportsRead[str], XMLParser | None], ElementTree]", local name has type "Callable[[Any, DefusedXMLParser | None, bool, bool, bool], ElementTree]")  [assignment]
+ torchvision/datasets/voc.py:10: error: Incompatible import of "ET_parse" (imported name has type "Callable[[int | str | bytes | PathLike[str] | PathLike[bytes] | SupportsRead[bytes] | SupportsRead[str], XMLParser[Any] | None], ElementTree[Element[str]]]", local name has type "Callable[[Any, DefusedXMLParser | None, bool, bool, bool], ElementTree[Element[str] | None]]")  [assignment]
+ torchvision/datasets/voc.py:201: error: Argument 1 to "parse_voc_xml" of "VOCDetection" has incompatible type "Element[str] | None"; expected "Element[str]"  [arg-type]

materialize (https://github.com/MaterializeInc/materialize)
- misc/python/materialize/cli/gen-chroma-syntax.py:56: error: Incompatible types in assignment (expression has type "Element | None", variable has type "Element")  [assignment]
+ misc/python/materialize/cli/gen-chroma-syntax.py:56: error: Incompatible types in assignment (expression has type "Element[str] | None", variable has type "Element[str]")  [assignment]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant