Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix TypeIs for types with type params in Unions #17232

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

kreathon
Copy link
Contributor

@kreathon kreathon commented May 10, 2024

Fix TypeIs narrowing for types with type parameters in Union types.

Addresses: #17181

Example:

bar: list[int] | list[str]

if is_list_int(bar):  <- here both Union members are currently erased to list[Any] (which is subtype of the erased TypeIs argument: list[Any])
    reveal_type(bar)
else:                 <- here is nothing left in the Union for this branch  and we get the type Never
    reveal_type(bar)  <- currently this is marked as unreachable

Implementation

My goal was to not split the implementation up (see first commit) into handling of the isinstance and TypeIs, but to use a common implementation.

Before, the code was using is_proper_subtype with erased types:

    supertype = erase_type(supertype)
    if is_proper_subtype(
        erase_type(item), supertype, ignore_promotions=True, erase_instances=True

However, this does no longer work (see example above). The idea is to use is_subtype (which should also be able to handle the isinstance implementation), because e.g. list[int] is subtype of list[Any].

The only problem with this implementation are "trivial" Any cases that should not result in any narrowing. The code tries to manually handle these (I could not find any existing method that would do something similar).

Test Plan

  • add testTypeIsUnionWithTypeParams (test case of the bug report)
  • add testTypeIsAwaitableAny test case, because it is also an example in the PEP. Note that the behavior of this test did not change with this implementation see.
  • add testTypeIsTypeAnytest case, because of pandera mypy_primer output (see below)
  • add testIsinstanceSubclassAny to document current behavior (see)

Primer Output

pandera

New behavior should be an improvement compared to before:

def is_subtype(
    arg1: Union[A, Type[A]],
    arg2: Union[B, Type[B]],
) -> bool:
    """Returns True if first argument is lower/equal in DataType hierarchy."""
    if inspect.isclass(arg1):
        reveal_type(arg1)
        arg1_cls = arg1
    else:
        reveal_type(arg1)
        arg1_cls = arg1.__class__

    arg2_cls = arg2 if inspect.isclass(arg2) else arg2.__class__
    return issubclass(arg1_cls, arg2_cls)

Old:

main.py:17: note: Revealed type is "type[Any]"
main.py:20: note: Revealed type is "Union[__main__.A, type[__main__.A]]"
main.py:21: error: Incompatible types in assignment (expression has type "type[A] | overloaded function", variable has type "type[Any]")  [assignment]
main.py:24: error: Argument 2 to "issubclass" has incompatible type "type[Any] | type[B] | overloaded function"; expected "_ClassInfo"  [arg-type]

New (see added test case):

main.py:19: note: Revealed type is "type[Any]"
main.py:22: note: Revealed type is "__main__.A"

Note

It seems that the narrowing still does not work as excepted in some cases (see the pandera example). The type should be type[__main__.A] instead of type[Any]. However, I think that this is a different issue.

This comment has been minimized.

@kreathon kreathon marked this pull request as draft May 10, 2024 10:56
@JelleZijlstra JelleZijlstra self-assigned this May 10, 2024

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@kreathon kreathon marked this pull request as ready for review May 25, 2024 15:55
@kreathon
Copy link
Contributor Author

@JelleZijlstra I am ready for feedback 🙂

This comment has been minimized.

@sterliakov
Copy link
Contributor

I played a bit with this implementation, and noticed an odd discrepancy. I'm not sure if this can/should be fixed here, though.

Cf.

[case testTypeIsListObjectFromList]
# flags: --warn-unreachable
from typing_extensions import TypeIs
from typing import List

def typeis(x: object) -> TypeIs[List[object]]: ...

x = ['a', 'b']
if typeis(x):
    reveal_type(x)  # N: Revealed type is "Never"
    x[1]  # E: Value of type Never is not indexable
else:
    reveal_type(x)  # N: Revealed type is "builtins.list[builtins.str]"

[builtins fixtures/list.pyi]

and

[case testTypeIsListObjectFromTuple]
# flags: --warn-unreachable
from typing_extensions import TypeIs
from typing import List

def typeis(x: object) -> TypeIs[List[object]]: ...

y = ('a', 'b')
if typeis(y):
    reveal_type(y)  # E: Statement is unreachable
    y[1]
else:
    reveal_type(y)  # N: Revealed type is "Tuple[builtins.str, builtins.str]"

[builtins fixtures/list.pyi]

Behaviour in both cases is explainable, but seems inconsistent.

Maybe worse: this does not handle bool <: int and int <: float in the same way. That's probably because bool is defined as class bool(int), while float is special-cased elsewhere. isinstance has the same trap, but x: float = 1 (or x: int = 1; y: float = x, to be sure) succeeds. I don't know whether it is intended behaviour - after all, assert not isinstance(1, float) passes - but this may be confusing for users with PEP484 in mind. This wasn't taken into account in PEP742 (TypeIs), so I think any solution would be acceptable, but some documentation or explicit tests could help.

[case testTypeIsBoolIsInt]
# flags: --warn-unreachable
from typing_extensions import TypeIs

def typeis(x: object) -> TypeIs[int]: ...

y = True
if typeis(y):
    reveal_type(y)  # N: Revealed type is "builtins.bool"
else:
    reveal_type(y)  # E: Statement is unreachable
if isinstance(y, int):
    reveal_type(y)  # N: Revealed type is "builtins.bool"
else:
    reveal_type(y)  # E: Statement is unreachable

[builtins fixtures/isinstancelist.pyi]

[case testTypeIsIntIsFloat]
# flags: --warn-unreachable
from typing_extensions import TypeIs

def typeis(x: object) -> TypeIs[float]: ...

y = 1
if typeis(y):
    reveal_type(y)  # N: Revealed type is "builtins.int"
else:
    reveal_type(y)  # N: Revealed type is "builtins.int"
if isinstance(y, float):
    reveal_type(y)  # N: Revealed type is "builtins.int"
else:
    reveal_type(y)  # N: Revealed type is "builtins.int"

[builtins fixtures/isinstancelist.pyi]

@JelleZijlstra
Copy link
Member

For the int/float case, hopefully we can make mypy switch to the model in python/typing#1748, which should make the behavior a lot more intuitive. I think until that happens, we don't need to worry too much about the behavior in relation to TypeIs.

@kreathon
Copy link
Contributor Author

kreathon commented Jun 6, 2024

Inconsistency of Revealed type is "Never" vs Statement is unreachable.

I see why this is happening: the tuple is filtered out earlier by the is_overlapping_types (resulting in an early unreachable). The list is not filtered out here. Later, while narrowing the variable, it results in Never. I think this also happens in some other cases (which are already documented with test cases). So it might be a more general problem.

I tried to play around with it, but it seems (for me at least) not to be trivial to fix. Either:

  • the find_isinstance_check needs to be improved and cover all the narrowing of the "narrowing-variables" (difficult and redundant)
  • all the narrowing that can happen while visiting the statement body should be able to mark the current block as unreachable when encountering Never
  • push_type_map should be improved to handle the narrowing before visiting the body (possibly the best solution?)

bool <: int vs. int <: float

I also think that this is an issue, but as far as I can tell by your examples @sterliakov that it is consistent with the current isinstance behavior (also no change compared to the master branch). So I guess this is (as mentioned by @JelleZijlstra) a more general problem.

@sterliakov
Copy link
Contributor

I don't consider Never vs unreachable issue critical at all: these cases differ only in error messages in fact, correctly rejecting unexpected checks. This might be a problem for users with --warn-unreachable disabled, though: I think that this flag might deserve a True default.

boot <: int <: float tower is more tough here, but I agree it has nothing to do with TypeIs part if we define TypeIs as "custom isinstance" according to the PEP suggestion. This problem should likely be resolved at some higher level.

This comment has been minimized.

@kreathon
Copy link
Contributor Author

Note that this would also help with: #17599

Copy link
Contributor

github-actions bot commented Oct 8, 2024

Diff from mypy_primer, showing the effect of this PR on open source code:

hydra-zen (https://github.com/mit-ll-responsible-ai/hydra-zen)
- src/hydra_zen/structured_configs/_implementations.py:2601: error: "type" has no attribute "__fields__"  [attr-defined]

pandera (https://github.com/pandera-dev/pandera)
+ pandera/dtypes.py:555: error: Unused "type: ignore" comment  [unused-ignore]

jax (https://github.com/google/jax)
+ jax/_src/tree_util.py:1048: error: Incompatible return value type (got "type[DataclassInstance]", expected "Typ")  [return-value]

@JelleZijlstra
Copy link
Member

The jax change (here: https://github.com/jax-ml/jax/blob/6a958b90b370b81913a0cb228427dd7820c60ccd/jax/_src/tree_util.py#L1048C10-L1048C18) looks suspicious. The variable starts as type Typ but there is an if is_dataclass() branch.

@kreathon
Copy link
Contributor Author

I played around with a fix for the jax change, however it introduced some other primer output regressions ... it's super tricky to get all the Any cases handled correctly.

Unfortunately, I currently have not much time to spend on this issue, so I (of course) do not mind if someone tries to fix my code / approach or takes over with a better idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants