Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inheritance and metaclass of base ctypes classes #12982

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

tungol
Copy link
Contributor

@tungol tungol commented Nov 7, 2024

The hierarchy of _CData types doesn't much resemble runtime right now. What if it did?

This is still a WIP, mostly because I suspect it might be disruptive but I'm not sure about that. If mypy-primer doesn't completely explode, I can do some fine tuning.

The hierarchy of _CData types doesn't much resemble runtime right now.
What if it did?

This is still a WIP, mostly because I suspect it might be disruptive
but I'm not sure about that. If mypy-primer doesn't completely
explode, I can do some fine tuning.
@tungol tungol marked this pull request as draft November 7, 2024 23:40

This comment has been minimized.

@tungol
Copy link
Contributor Author

tungol commented Nov 7, 2024

Hmmmmm

This comment has been minimized.

This comment has been minimized.

They probably can't be fixed, but I'm not certain enough right now.
@tungol tungol marked this pull request as ready for review November 8, 2024 10:17

This comment has been minimized.

@tungol
Copy link
Contributor Author

tungol commented Nov 8, 2024

Seems good! I'm encouraged by the fact that an actual error showed up in mypy-primer when I had something wrong.

I found it interesting that the comment about how the classmethods on _CData couldn't be modeled properly was wrong because actually they were methods on a metaclass and not classmethods in the first place.

The note on _typeshed.Self says it should only be used for __new__ on metaclasses, but it seemed appropriate to use it for alternate constructor methods on a metaclass as well. I hadn't done typing on a metaclass before, so that was interesting to figure out.

This comment has been minimized.

This comment has been minimized.

@junkmd
Copy link
Contributor

junkmd commented Nov 10, 2024

Hi,

+1 for the potential to resolve the false positives detected by type checkers around the metaclasses used in comtypes, which might allow the removal of type: ignore.

If this PR gets merged, I would appreciate guidance on how projects like comtypes that use "legacy alias"es should update their typing symbols.

This comment has been minimized.

@tungol
Copy link
Contributor Author

tungol commented Nov 10, 2024

Hi @junkmd ,

Unfortunately this particular MR doesn't have much impact on comtypes. I did a local comparison using mypy --strict and there were only two things that meaningfully changed, both additional findings:

> comtypes/_memberspec.py:191: error: "type[_CData]" has no attribute "from_param"  [attr-defined]
> comtypes/test/find_memleak.py:26: error: Too many values to unpack (2 expected, 3 provided)  [misc]

The first is, strictly speaking, correct. _CData at runtime has no from_param method. This is kind of unfortunate, actually, because every subclass of _CData does have this method, and type[_CData] is used here to refer to every subclass of _CData, not _CData itself, actually. It might be worth continuing to lie in the stubs about this to avoid this issue, or if _CData does become strictly accurate in the stubs then maybe setting up a typealias which is a Union of every _CData subclass could provide a replacement for this use case? I might consider this more and see what can be done. While true, this is a worse overall experience with no alternative currently being offered.

The second is from a line like this:

for n, _ in self._fields_:

This is also a correct addition. _fields_ is typically a list of 2-tuples, and on the class in question within comtypes that's true, but in the general case it's valid for _fields_ to include 3-tuples (c.f. https://docs.python.org/3/library/ctypes.html#ctypes.Structure._fields_). It's safe because your class doesn't use any 3-tuple fields, but I don't think there's anything that could be done in the stubs that would allow mypy to realize that.

As for the aliases, that's just CField and CArgObject. These have both been in typeshed for quite a while, with their names prefixed with an underscore because they're not exposed at runtime. @type_check_only provides a better solution for that (in my opinion, anyway) so this MR would update them to match their runtime names (no underscore).

There's no particular need for you to stop using from ctypes import _CArgObject when you need to use this type. This MR would create the option of using from _ctypes import CArgObject instead, which I personally like because a) the name of the type is CArgObject, not _CArgObject and b) I don't like that ctypes re-exports _CArgObject from where it's defined in _ctypes; if it hadn't been re-exported there already I wouldn't have added it myself. But both of these reasons are relatively idiosyncratic and I think the only reasonable guidance is that you're fine to use both CField and CArgObject however you already do for the indefinite future.

This comment has been minimized.

@tungol tungol marked this pull request as draft November 11, 2024 00:05
@tungol
Copy link
Contributor Author

tungol commented Nov 11, 2024

Latest commit adds _CDataType, which encompasses all subclasses of _CData and can be used in place of _CData in typing contexts to capture the common features of those subclasses that aren't on _CData itself.

I also added __ctypes_from_outparam__ which is used by comtypes and was showing up in the mypy diffs. I'm pretty sure the behavior of this function (usually returns Self, but returns _T for _SimpleCData) is related to the lengthy TODO comment on the Array class.

Marking as draft while I check out those mypy-primer results.

This comment has been minimized.

@tungol tungol marked this pull request as ready for review November 11, 2024 03:30
@tungol
Copy link
Contributor Author

tungol commented Nov 11, 2024

I updated to continue to accept _CData typed things in function arguments, while preferentially returning the more precise _CDataType. That solves the diff from comtypes.

I think the pwndbg diff makes sense: they're using the existence of a _length_ attribute as a method of testing whether or not a ctypes object is an array. If it exists, they assume it's an Array and proceed to access _type_ without testing for it separately.

But a struct or union might have a member which is coincidentally named _length_, and that's no guarantee that it will have a _type_ member as well. The exact error message is not the clearest, but this doesn't strike me as a safe pattern in the first place. Prior to this change, both attributes were just getting typed as Any.

@junkmd
Copy link
Contributor

junkmd commented Nov 11, 2024

Hi @tungol,

Thank you for your thorough investigation.

Unfortunately this particular MR doesn't have much impact on comtypes

Nonetheless, such improvements to the type stubs bring the possibility of comtypes static typing more closely with actual behavior in the future.
The fact that the metaclasses now more accurately reflect real behavior and that the __ctypes_from_outparam__ method is defined are especially significant.

Given the shared history that the originator of both ctypes and comtypes is the same person, I believe that improvements related to ctypes will contribute to advancements in comtypes as well.

Once this is merged and incorporated into major type checkers, I intend to work on refining comtypes static typing with the community.

If you have any thoughts on comtypes typing, please feel free to open issues.

Copy link
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

pwndbg (https://github.com/pwndbg/pwndbg)
+ pwndbg/aglib/heap/structs.py: note: In member "read_field" of class "CStruct2GDB":
+ pwndbg/aglib/heap/structs.py:175: error: Item "PyCSimpleType" of "type[_SimpleCData[Any]] | type[_Pointer[Any]] | type[CFuncPtr] | type[_ctypes.Union] | type[Structure] | type[Array[Any]]" has no attribute "_type_"  [union-attr]
+ pwndbg/aglib/heap/structs.py:175: error: Access to generic instance variables via class is ambiguous  [misc]
+ pwndbg/aglib/heap/structs.py:175: error: Item "PyCFuncPtrType" of "type[_SimpleCData[Any]] | type[_Pointer[Any]] | type[CFuncPtr] | type[_ctypes.Union] | type[Structure] | type[Array[Any]]" has no attribute "_type_"  [union-attr]
+ pwndbg/aglib/heap/structs.py:175: error: Invalid index type "Any | type[Any] | CField[Any, Any, Any] | overloaded function" for "dict[type[object], Type]"; expected type "type[object]"  [index]
+ pwndbg/aglib/heap/structs.py:177: error: Argument 1 to "array" of "Type" has incompatible type "Any | CField[Any, Any, Any] | overloaded function"; expected "int"  [arg-type]
+ pwndbg/aglib/heap/structs.py: note: In member "fields" of class "CStruct2GDB":
+ pwndbg/aglib/heap/structs.py:206: error: Item "PyCSimpleType" of "type[_SimpleCData[Any]] | type[_Pointer[Any]] | type[CFuncPtr] | type[_ctypes.Union] | type[Structure] | type[Array[Any]]" has no attribute "_type_"  [union-attr]
+ pwndbg/aglib/heap/structs.py:206: error: Access to generic instance variables via class is ambiguous  [misc]
+ pwndbg/aglib/heap/structs.py:206: error: Item "PyCFuncPtrType" of "type[_SimpleCData[Any]] | type[_Pointer[Any]] | type[CFuncPtr] | type[_ctypes.Union] | type[Structure] | type[Array[Any]]" has no attribute "_type_"  [union-attr]
+ pwndbg/aglib/heap/structs.py:206: error: Invalid index type "Any | type[Any] | CField[Any, Any, Any] | overloaded function" for "dict[type[object], Type]"; expected type "type[object]"  [index]
+ pwndbg/aglib/heap/structs.py:207: error: Argument 1 to "array" of "Type" has incompatible type "Any | CField[Any, Any, Any] | overloaded function"; expected "int"  [arg-type]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants