Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not ignore generic type args when checking multiple inheritance compatibility #18270

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

sterliakov
Copy link
Contributor

@sterliakov sterliakov commented Dec 9, 2024

Fixes #18268. Fixes #9319. Fixes #14279. Fixes #9031. Resolves "another example" in #6184 (which is not another example in fact but a different issue - overriding in class matters.

Currently mypy traverses the MRO of a class in case of multiple inheritance, ignoring the provided type arguments. Ths causes both false positives (see #18268) and false negatives (see e.g. testSubtypingIterableUnpacking2: class X(Iterator[U], Mapping[T, U])).

I propose walking over a similar structure: a "typed MRO". It's a regular MRO where every instance is replaced with (one or more) Instances with associated typevars, one per each base having that type in its MRO. Traversing over such allows us to know when the same typevar is substituted in multiple bases.

This PR also removes incorrect cross-checks for 2nd, 3rd and so on MRO entries when the first entry contributed same name compatible with all further entries (#14279).

This comment has been minimized.

@sterliakov sterliakov marked this pull request as draft December 9, 2024 04:09

This comment has been minimized.

mypy/checker.py Outdated Show resolved Hide resolved

This comment has been minimized.

This comment has been minimized.

@sterliakov
Copy link
Contributor Author

  • xarray: was a false positive, Incompatible definition of generic field on multiple base classes #9031
  • pymongo: don't see what's wrong there, but looks unrelated?
  • steam: weird and unsafe, done for runtime hints usage.
  • ibis: good (inherits class FrozenDict(dict, Mapping[K, V]) - dict is missing type params, so should be compatible)
  • homeassistant: very long hierarchy, but looks correct: the default for the omitted typevar is DataUpdateCoordinator[dict[str, Any]], coordinator type args are incompatible
  • kornia: correct, silencing transitive errors

@sterliakov sterliakov marked this pull request as ready for review December 10, 2024 00:24

This comment has been minimized.

mypy/checker.py Outdated
# This detects e.g. `class A(Mapping[int, str], Iterable[str])` correctly.
# For each MRO entry, include it parametrized according to each base inheriting
# from it.
typed_mro = [

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much better, but I am still thinking, that this typed_mro should be builded at MRO calculation phase, not in checker.
This logic should be used shared between all class checks, not compatibility-only, so it will be the code to copy between them, I afraid

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, we can replace regular MRO to typed one at all? In could be breaking change for some checks, but the idea looks generally correct for me

Copy link

@Lancetnik Lancetnik Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, does it deliver types to parent correctly in case, where we specify them not in the final inheritor?

Like in the follow

class A[T]:
    # T = str

class B[T2](A[str]):
     # T2 = int

class C(B[int]):
    ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be builded at MRO calculation phase, not in checker.

Makes sense to me, but I'd rather defer this to a follow-up PR. There are several more places (override checks) where a "typed MRO" would be handy, we should adjust them one by one - that'd make this PR a lot harder to review.

Probably, we can replace regular MRO to typed one at all?

Those are of different types. Fundamentally, TypeInfo and Instance are different entities that shouldn't be mixed up. It makes sense to me as a general idea, but now it would result in a very huge changeset I won't be able to handle correctly (and no one will enjoy reviewing). `mypy

does it deliver types to parent correctly

Hope so, map_instance_to_supertype looks pretty robust. If I understand your snippet correctly, it's similar to testMultipleInheritanceCompatibleTypeVar - generics are mapped there as expected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building that list is really fast, so there should be very little perf impact, if any. I'm not sure that it makes sense to cache it at all. This "typed MRO" is a property of an Instance, not of a TypeInfo, so we'll build it for every Instance we have, but only use on a few occasions. It's actually important to build it parametrized precisely by last known arguments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if property is good (current codebase seems to follow "only trivial computations for properties" rule). A method would be more reasonable, but the problem is in dependencies between those files (note that mypy.nodes never imports mypy.types - the dependency is inverted, and that was done for architectural reasons). TBH I think this should be a method of TypeChecker - does that sound good enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding other places - I have a strong feeling that other override-related issues would benefit from using typed MRO explicitly.

However, I looked through the issue tracker now, and seems like almost everything else is fine - the only problem I see right now is #6184 which can indeed be solved using "typed MRO" here:

mypy/mypy/checker.py

Lines 1983 to 1993 in 1497407

found_method_base_classes: list[TypeInfo] = []
for base in defn.info.mro[1:]:
result = self.check_method_or_accessor_override_for_base(
defn, base, check_override_compatibility
)
if result is None:
# Node was deferred, we will have another attempt later.
return None
if result:
found_method_base_classes.append(base)
return found_method_base_classes

But you can just search for .mro[1:] in checker.py, some usages of it may be problematic. check_compatibility_all_supers should also be using typed MRO if I understand it correctly. Any final checks, OTOH, don't need that and work with plain MRO just as good.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I think this should be a method of TypeChecker - does that sound good enough?

LGTM 👍🏿

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try to take care about #6184 if you are okay with it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review! I extracted a helper method (and also stopped duplicating non-generic bases in typed MRO) and borrowed a testcase.

Sure, go ahead with that ticket, that would be great! Please also try to search the issue tracker, I might have missed other reports related to that one.

mypy/checker.py Outdated
# For each MRO entry, include it parametrized according to each base inheriting
# from it.
typed_mro = [
map_instance_to_supertype(base, parent)
Copy link

@Lancetnik Lancetnik Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still confused by this line. Seems like we don't propogate real types to whole tree. We are using just a current class bases, so it propogates options to class+2 level maximum

class A[T]: ... 
class B[T](A[T]): ...  # <- str propagated here only
class C(B[str]): ...
class D(C): ...

I am afraid, that A doesn't know about str in D class analyzing.
I think, we should build the tree recursively - it was implemented in my PR

Copy link

@Lancetnik Lancetnik Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does map_instance_to_supertype(C(B[str]), A[T]) works?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Case could be even harder:

class A[T]: ... 
class B(A[str]): ...
class C(B): ...
class D(C): ...

Does map_instance_to_supertype(C, A[T]) works as expected?

Copy link
Contributor Author

@sterliakov sterliakov Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the first case typed_mro for class Foo(D, int) is

[__main__.D, __main__.C, __main__.B[builtins.str], __main__.A[builtins.str], builtins.int, builtins.object, builtins.object]

In the second case it is (also for class Foo(D, int)):

[__main__.D, __main__.C, __main__.B, __main__.A[builtins.str], builtins.int, builtins.object, builtins.object]

And the following raises a diagnostic as expected:

from typing import Generic, TypeVar

T = TypeVar("T")
class A(Generic[T]):
    foo: T
class B(A[str]): ...
class C(B): ...
class D(C): ...
class Foo(D, A[T]): ...  # E: Definition of "foo" in base class "A" is incompatible with definition in base class "A"

(map_instance_to_supertype lives in maptype.py, and logic there seems to cover all such propagations - this is not the most efficient approach compared to building "typed MRO" recursively, but much easier one since the necessary parts are already written and validated)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I am new in mypy 😄 and don't now, how map_instance_to_supertype works exactly - thanks for clarify.
Seems, like everything is fine and PR is ready to be merged

Copy link
Member

@sobolevn sobolevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! It looks like a reasonable approach, however, I would leave the final decision to @JukkaL or @ilevkivskyi :)

Copy link
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

xarray (https://github.com/pydata/xarray)
+ xarray/core/groupby.py:1610: error: Unused "type: ignore" comment  [unused-ignore]
+ xarray/core/groupby.py:1778: error: Unused "type: ignore" comment  [unused-ignore]
+ xarray/core/resample.py:236: error: Unused "type: ignore" comment  [unused-ignore]
+ xarray/core/resample.py:379: error: Unused "type: ignore" comment  [unused-ignore]

mongo-python-driver (https://github.com/mongodb/mongo-python-driver)
+ pymongo/helpers_shared.py:131: error: Unused "type: ignore" comment  [unused-ignore]

steam.py (https://github.com/Gobot1234/steam.py)
+ steam/ext/commands/converters.py:273: error: Definition of "converter_for" in base class "Converter" is incompatible with definition in base class "Converter"  [misc]
+ steam/ext/commands/converters.py:294: error: Definition of "converter_for" in base class "Converter" is incompatible with definition in base class "Converter"  [misc]
+ steam/ext/commands/converters.py:302: error: Definition of "converter_for" in base class "Converter" is incompatible with definition in base class "Converter"  [misc]
+ steam/ext/commands/converters.py:389: error: Definition of "converter_for" in base class "Converter" is incompatible with definition in base class "Converter"  [misc]

ibis (https://github.com/ibis-project/ibis)
- ibis/common/collections.py:280: error: Definition of "get" in base class "Mapping" is incompatible with definition in base class "Mapping"  [misc]

core (https://github.com/home-assistant/core)
+ homeassistant/components/ring/entity.py:176: error: Definition of "coordinator" in base class "BaseCoordinatorEntity" is incompatible with definition in base class "BaseCoordinatorEntity"  [misc]
+ homeassistant/components/tesla_fleet/entity.py:157: error: Definition of "coordinator" in base class "BaseCoordinatorEntity" is incompatible with definition in base class "BaseCoordinatorEntity"  [misc]
+ homeassistant/components/tomorrowio/weather.py:97: error: Definition of "coordinator" in base class "BaseCoordinatorEntity" is incompatible with definition in base class "BaseCoordinatorEntity"  [misc]

kornia (https://github.com/kornia/kornia)
+ kornia/augmentation/_3d/mix/transplantation.py:7: error: Unused "type: ignore" comment  [unused-ignore]

@ilevkivskyi
Copy link
Member

Yes, I am very much interested in this topic (i.e. issues similar to #6184). I vaguely remember that last time I thought about this it seemed the best way to address all this is check derivation paths compatibility (see map_instance_to_supertypes()). I will try to take a proper look on weekend (or maybe early next week).

@sterliakov
Copy link
Contributor Author

@ilevkivskyi just a gentle ping if you forgot about this:)

@ilevkivskyi
Copy link
Member

Hi! No, I didn't forget to look at it, but just finally found energy to post my findings. Looking at the tests and issues closed, it seems like you are trying to solve three separate problems with multiple inheritance in this PR:

  1. Only the most derived definition of an attribute should be checked to be a subtype w.r.t. to definitions in further MRO entries, not each definitions pair in the MRO.
  2. Map/expand step is missing when comparing attributes with non-callable types.
  3. Mypy has various inconsistent behaviours when a generic class (say B) comes to the MRO from two bases with different parameters (say B[int] and B[str]).

While first two are just simple obvious bugs, the last one is actually a non-trivial thing and any attempts towards it should be excluded from this PR. Now my general comments on the first two issues:

  1. I think you can simplify your implementation and make the logic more obvious, if instead of using seen_names (which hints at some kind of caching or similar) you will:
    • have one (outer) loop over attribute name
    • find the most derived definition for given attribute, if it is in current class, skip this attribute
    • in the second (inner) loop go over classes that appear after the index where you found most derive definition
  2. I think you can simply add missing map_type_from_supertype() call in the non-callable branch (the one that starts with elif first_type and second_type:), see bind_and_map_method() for how to call it. All the "typed MRO" stuff is not needed for this (and I am not even sure it is correct).

After you will update the PR accordingly, I will look again and leave more detailed comments (if needed).

@sterliakov
Copy link
Contributor Author

sterliakov commented Dec 17, 2024

Thank you for this thorough review, @ilevkivskyi!

I must confess I hardly understand some of your points. This PR is trying to solve one primary problem. The problem is: when checking compatibility of parents in a multiple inheritance scenario, mypy fails to understand the relationship between generics in those. This issue is related to your problem 3, but doesn't directly correspond to it.

I agree that problem 1 is solved here (seen_names check).

On the other hand, problem 2 just ceases to exist in the proposed implementation. I'm reasonably confident that simplifying some fragile checks by obtaining already parametrized callables from corresponding base classes is somewhat more obviously correct.

#18268 and #9031 share a common source: mypy considers base TypeInfos (MRO entries) instead of instances with appropriately mapped generics. "Typed MRO" approach I propose fixes exactly that: we just build an MRO-like structure with all generics propagated from current definition. #14279 and #9319 are indeed both caused by the problem 1 from your list.

So yes, now there are two problems solved here: generics mapping and avoiding cross-checks for each pair of MRO entries. If you'd prefer, I can submit the second part as a separate PR, leaving only the "typed MRO" here and removing seen_names check.

Could you explain a bit further your concerns regarding using this "typed MRO" structure? It looks like a very obvious solution to me, so I would really prefer to stick to it unless you see some hidden deficiencies I'm missing.

Regarding your specific suggestions, no. 1 turned out to be more complicated and less efficient (because we have to build a list of all attributes first, and then traverse the MRO or typed MRO again to find the source of that name). seen_names contains all attributes that were already checked against - if any of preceding bases defines that name, all following bases don't have to define it compatibly with each other). No. 2 assumes (AFAIC) that we ditch the "typed MRO" which I'd prefer to keep.

@ilevkivskyi
Copy link
Member

The problem is: when checking compatibility of parents in a multiple inheritance scenario, mypy fails to understand the relationship between generics in those.

The real problem is that this statement is factually wrong. There is nothing fundamental missing in mypy. As I already explained, mypy, as written now, only applies the correct map/expand steps when comparing attribute types only when those types are callable types. For example, your test testMultipleInheritanceCompatibleTypeVar already passes on master (modulo error message order) if I replace x: T with x: Callable[[], T] everywhere.

No. 2 assumes (AFAIC) that we ditch the "typed MRO" which I'd prefer to keep.

Yes, we don't need all that. Because in best case it would be re-inventing the wheel (and I don't think it is even correct as written). This whole generic handling problem can be fixed with just a two-line diff:

         elif first_type and second_type:
             if isinstance(first.node, Var):
+                first_type = map_type_from_supertype(first_type, ctx, base1)
                 first_type = expand_self_type(first.node, first_type, fill_typevars(ctx))
             if isinstance(second.node, Var):
+                second_type = map_type_from_supertype(second_type, ctx, base2)
                 second_type = expand_self_type(second.node, second_type, fill_typevars(ctx))
             ok = is_equivalent(first_type, second_type)

Regarding your specific suggestions, no. 1 turned out to be more complicated and less efficient

I doubt there will be any measurable difference. Also premature optimization is the root of all evil. To be clear, I mean something as simple as this:

        all_names = {name for base in mro for name in base.names}
        for name in all_names - typ.names.keys():
            if is_private(name):
                continue
            i, base = next((i, base) for i, base in enumerate(mro) if name in base.names)
            for base2 in mro[i + 1 :]:
                if name in base2.names and base2 not in base.mro:
                    self.check_compatibility(name, base, base2, typ)

which is a double-nested loop instead of a triple-nested loop, and IMO reads self-explanatory.

@sterliakov
Copy link
Contributor Author

sterliakov commented Dec 19, 2024

Wow, your first paragraph was the missing bit, thanks! I get your point now.

This approach is similar to my very first attempt that didn't make it into this PR. I replaced it with current approach for the following reason:

from collections.abc import Iterable, Mapping
from typing import TypeVar

_T = TypeVar("_T")
_U = TypeVar("_U")

class Bad(Mapping[_T, int], Iterable[_U]):
    pass

This snippet should obviously (? perhaps I'm misunderstanding something again?) generate a diagnostic as _T and _U are unrelated, and __iter__ isn't overridden. This snippet is accepted by current mypy master and can't be rejected by any algorithm relying on MRO alone - Iterable is present in MRO only once, and so will be parameterized with the same type. __iter__ is only defined in Iterable, so won't be checked as a Mapping attribute, and then we won't attempt to compare Iterable[_T] vs Iterable[_U] since both map to a single Iterable TypeInfo in the MRO.

There's an old test case testSubtypingMappingUnpacking3 commenting that behaviour, but no explanation is provided there. That case was added in #11151. This applies to any scenario when impossible parameter combination is requested for derived and parent class, both included in bases.

To be clear, I mean something as simple as this:

And that's nice, thanks! This does read better indeed. And it's true that is_subtype was an overkill here, MRO check should suffice (not MRO exactly - only "typed MRO" again, otherwise incompatible type parameters won't be detected again).

Wrapping up

To avoid any heated discussions here, I'm ready to close this PR and open a replacement according to your suggestions above, but I still believe that existing MRO-based checks are inferior to traversing the "typed MRO" (essentially a copy of MRO with generic entries duplicated with different params from their corresponding bases) as they fail to detect a typing error by design.

I do trust your maintainer's knowledge and expertise and don't question them, but sometimes being focused on a problem for a long time can make one's judgement slightly biased. My code is not my child, I don't mind iterating and throwing away ideas proven unsuccessful, and I'm really committed to improving mypy in whatever way I can. Please consider asking someone else to compare both of the proposed solutions if possible, or at least coming back to my proposal in a day or two with a fresh pair of eyes. Thank you for your time! Please ping here if you insist on proceeding with your proposal, I'll open a superseding PR with that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants