Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize is_unsafe_overlapping_overload_signatures for overlapping typevars case #18159

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

alexdrydew
Copy link
Contributor

Fixes #18022

This change optimizes is_unsafe_overlapping_overload_signatures so overloads containing many type variables can be checked faster.

Let:

  • $N$ - number of function overloads
  • $K$ - total number of type variables in function definition
  • $V$ - max number of constraints in type variables

It seems that currently typechecking a functions with overloads have $O(N^2V^{2K})$ time complexity:

  • we need to check each pair of overrides
  • then we expand both signatures to at most $V^K$ variants

If I am not mistaken (please correct me if I am) it is not necessary to check pairs of overrides where same type variable is expanded to different variants across different signatures, so we can expand variables jointly for both signatures reducing time complexity to $O(N^2V^K)$

On my original example (using 6 type variables) this change gives 40x time improvement:

Example code snippet
from __future__ import annotations

from typing import Generic, TypeVar, overload

T1 = TypeVar("T1", int, str, bool)
T2 = TypeVar("T2", int, str, bool)
T3 = TypeVar("T3", int, str, bool)
T4 = TypeVar("T4", int, str, bool)
T5 = TypeVar("T5", int, str, bool)
T6 = TypeVar("T6", int, str, bool)

class A(Generic[T1, T2, T3, T4, T5, T6]):
    @overload
    def foo(self: A[int, T2, T3, T4, T5, T6]) -> int: ...

    @overload
    def foo(self: A[str, T2, T3, T4, T5, T6]) -> str: ...

    def foo(self) -> int | str:
        raise NotImplementedError
py-spy results without change

profile

py-spy results after change

profile_new

This comment has been minimized.

@alexdrydew alexdrydew marked this pull request as ready for review November 16, 2024 17:20
@sterliakov
Copy link
Contributor

This looks like an important performance fix, but unfortunately seems to change the mypy behaviour significantly. Two testcases based on your example:

from __future__ import annotations

from typing import Generic, TypeVar, overload

T1 = TypeVar("T1", int, str, bool)
T2 = TypeVar("T2", int, str, bool)
T3 = TypeVar("T3", int, str, bool)
T4 = TypeVar("T4", int, str, bool)

class A(Generic[T1, T2, T3, T4]):
    @overload
    def foo(self: A[int, T2, T3, T4]) -> int: ...

    @overload
    def foo(self: A[int, T2, T3, bool]) -> str: ...

    @overload
    def foo(self: A[str, T2, T3, T4]) -> str: ...

    def foo(self) -> int | str:
        return 0

And

from __future__ import annotations

from typing import Generic, TypeVar, overload

T1 = TypeVar("T1", int, str, bool)
T2 = TypeVar("T2", int, str, bool)
T3 = TypeVar("T3", int, str, bool)
T4 = TypeVar("T4", int, str, bool)

class A(Generic[T1, T2, T3, T4]):
    @overload
    def foo(self: A[int, T2, T3, bool]) -> str: ...

    @overload
    def foo(self: A[int, T2, T3, T4]) -> int: ...

    @overload
    def foo(self: A[str, T2, T3, T4]) -> str: ...

    def foo(self) -> int | str:
        return 0

(they only differ in overloads order)

Current mypy master correctly emits

error: Overloaded function signatures 1 and 2 overlap with incompatible return types

in both cases. With this patch the first one is green (which is invalid) and the second one emits

error: Overloaded function signatures 1 and 2 overlap with incompatible return types
note: Flipping the order of overloads will fix this error

This comment has been minimized.

@alexdrydew
Copy link
Contributor Author

@sterliakov
Thank you for finding this case. It seems that the problem is deeper than I've initially thought.

Initially I thought that we get overload-overlap error for your provided examples because of T4: bool variant, but it turns out that the real source of the error is bool being subtype of int. It can be simplified to this example:

from typing import Generic, TypeVar, Union, overload


class A:
    pass


class B(A):
    pass


T1 = TypeVar("T1", A, B)
T2 = TypeVar("T2", A, B)


class T(Generic[T1, T2]):
    @overload
    def foo(self: T[T1, T2]) -> int: ...

    @overload
    def foo(self: T[T1, A]) -> str: ...  # same for flip signatures case

    def foo(self) -> Union[int, str]:
        raise NotImplementedError

Actual variant that triggers the error:

@overload
def foo(self: T[A, B]) -> int: ...

@overload
def foo(self: T[B, A]) -> str: ...
alternative example where `B` is not a subtype of `A` does not raise any errors
from typing import Generic, TypeVar, Union, overload


class A:
    pass


class B:
    pass


T1 = TypeVar("T1", A, B)
T2 = TypeVar("T2", A, B)


class T(Generic[T1, T2]):
    @overload
    def foo(self: T[T1, T2]) -> int: ...

    @overload
    def foo(self: T[T1, A]) -> str: ...  # same for flip signatures case

    def foo(self) -> Union[int, str]:
        raise NotImplementedError

It seems that mypy behavior here (at least according to testOverloadedPartiallyOverlappingTypeVarsAndUnion test case) is to assume that self type is covariant and to allow overload signatures to shadow later narrower signatures for some variants of generic.

I am actually not sure what is the correct behavior here. Currently this one seems most consistent to me:

  • all overloads are checked jointly, so each type variable can have only one value at a time across all overloads (I don't think that we should match against multiple variants of generic self when calling method at any point because we should already have specific type)
  • unsafe overlap error may only be triggered when there are partially overlapping signature variants and the earlier signature arguments are strictly narrower than the later ones but return types are incompatible (otherwise the second incompatible signature is considered shadowed). This case is actually triggered in your second example.

Maybe it is a good idea to allow only strictly narrower function signature variants to shadow later ones to prevent cases when there is a pair of signature variants with exactly same arguments but incompatible return types? I've changed the last check in is_unsafe_overlapping_overload_signatures to handle this edge case in my last commit, but as you may see even that already changes behavior of at least one test case.

@sterliakov
Copy link
Contributor

sterliakov commented Nov 18, 2024

Maybe it is a good idea to allow only strictly narrower function signature variants to shadow later ones to prevent cases when there is a pair of signature variants with exactly same arguments but incompatible return types?

Could you illustrate this with some example? As far as I understand, you suggest the following to become allowed:

from typing import overload

class A: pass
class B(A): pass

@overload
def fn(x: B) -> str: ...
@overload
def fn(x: A) -> float: ...
def fn(x: A) -> str | float:
    return "" if isinstance(x, B) else 0

It's quite important to report overloads-overlap here since otherwise the following (safe) upcasting results in unsafe code that raises at runtime and reports no mypy errors:

def handle(t: A) -> float:
    return fn(t) + 1

handle(B())

BTW, there's a bug in current mypy (#10143): if I replace A and B with float and int resp., the error isn't reported... Also note #10157.

@alexdrydew
Copy link
Contributor Author

Could you illustrate this with some example? As far as I understand, you suggest the following to become allowed

Not really, sorry for the confusion, I meant that strictly wider in terms of the arguments earlier overload signatures should shadow the later ones (emphasis on strictly here, otherwise it is just the current behavior), so:

  • current variant of testOverloadedPartiallyOverlappingTypeVarsAndUnion is still valid:
@overload
def g(x: Union[B, C]) -> int: ...
@overload
def g(x: B) -> str: ...
def g(x): pass
  • but testOverloadOnExactSameTypeVariantWithIncompetibleReturnTypes is now invalid due to the T: B variant
class A: pass
class B: pass
T = TypeVar('T', A, B)
@overload
def f(x: T) -> int: ...  # E: Overloaded function signatures 1 and 2 overlap with incompatible return types
@overload
def f(x: B) -> str: ...
def f(x): ...
  • but this is still valid (as in testOverloadsSafeOverlapAllowed)
T = TypeVar('T', A, B)
@overload
def f(x: T) -> int: ...
@overload
def f(x: B = ...) -> str: ...
def f(x): ...

@sterliakov
Copy link
Contributor

Ohh. I have to ask: why don't we produce "signature 2 will never be matched [overload-cannot-match]" there? It is indeed type-safe as the second overload is just ignored entirely, but also indicates a programming error.

I don't think [overloads-overlap] is appropriate here, not emitting it in your case is fine, and that becomes a separate [overload-cannot-match] bug, but I have a feeling that something is deeply wrong with overloads compatibility checking - perhaps it's time to rewrite that part...

@sterliakov
Copy link
Contributor

TBH, I think changes in this PR improve mypy - even though not all corner cases are handled, they weren't handled previously either, and the performance problem is rather dramatic. Could you add a testcase exercising this specific behaviour (many vars in overloads) to prevent introducing similar perf regressions in future?

@alexdrydew
Copy link
Contributor Author

alexdrydew commented Nov 19, 2024

Ohh. I have to ask: why don't we produce "signature 2 will never be matched [overload-cannot-match]" there? It is indeed type-safe as the second overload is just ignored entirely, but also indicates a programming error.

The problem is that [overload-cannot-match] check is done using an approximation where type variables are replaced with unions of all their respective bounds, so we can't detect errors which are true only for some specific variants of type variables. I guess to fix that one would need to rewrite this part to expand signatures by type variables as well or rewrite check_overlapping_overloads completely.

This comment has been minimized.

Copy link
Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Overloads in generic classes are extremely slow when using generic self
2 participants