Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: use package_distributions to get top levels #861

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

mkniewallner
Copy link
Collaborator

@mkniewallner mkniewallner commented Sep 14, 2024

PR Checklist

  • A description of the changes is added to the description of this PR.
  • If there is a related issue, make sure it is linked to this PR.
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added or modified a feature, documentation in docs is updated

Description of changes

Those changes are extracted from #854, as a first step towards using importlib.metadata/importlib_metadata to remove some custom code we have, and make "package" <-> "distribution" discovery more precise.

Python 3.10 introduced packages_distributions that returns a mapping that maps Python module names to the packages that expose them, by reading from top_level.txt like we do. In Python 3.11, the method became even smarter by falling back to reading from RECORD when no top_level.txt is present, also exactly like we do.

Updates in importlib.metadata are upstreamed from importlib_metadata, so by requiring >=4.13 on Python < 3.11, we could use packages_distributions in all Python versions we support, and also benefit from the improvement made on Python 3.11.

There is one functional change in behaviour with the new implementation, which IMO makes sense to have. Currently, if a package has a RECORD file, but we don't match any Python module name in the file, we don't use the fallback that assumes the module name is the name with _ instead of -. With the changes, we will now use the fall back if there was no match in RECORD.

Test plan

  • Existing functional tests are passing
  • Functional test added to ensure that package normalisation is applied
  • Added unit tests for the newly added functions

Copy link

codecov bot commented Sep 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.9%. Comparing base (6a43663) to head (9262b33).

Additional details and impacted files
@@           Coverage Diff           @@
##            main    #861     +/-   ##
=======================================
+ Coverage   92.7%   92.9%   +0.2%     
=======================================
  Files         37      39      +2     
  Lines        992     993      +1     
  Branches      98      98             
=======================================
+ Hits         920     923      +3     
+ Misses        56      55      -1     
+ Partials      16      15      -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mkniewallner mkniewallner force-pushed the chore/use-package-distributions branch 14 times, most recently from 9f741d0 to 0640431 Compare September 15, 2024 09:30
python/deptry/distribution.py Outdated Show resolved Hide resolved
python/deptry/distribution.py Outdated Show resolved Hide resolved
@mkniewallner mkniewallner force-pushed the chore/use-package-distributions branch 2 times, most recently from d3209f4 to 6b3700c Compare November 3, 2024 16:08
@mkniewallner
Copy link
Collaborator Author

Having some second thoughts on the new mechanism for falling back to guessing the module name from the package name that this PR would introduce. Several packages don't directly expose modules (for instance some CLIs). So trying to apply a fallback if the package made it clear that there is no exposed module at all does not really feel right in the end. There might be a way to keep the behaviour while being able to use package_distributions, but this might be a bit hacky.

This might be controversial, but should we really have a fallback anyway? Nowadays I believe that most packages do provide the metadata we need, and I personally see the fallback mechanism as too magical, knowing that it could even introduce some false positives in really specific cases. Additionally, users can still provide the mapping themselves if needed, for packages that don't have metadata at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants