Determining cross-platform resolution strategy #13111

Open
1 task done
ofek opened this issue Dec 11, 2024 · 3 comments
Labels
C: dependency resolution (About choosing which dependencies to install); state: needs discussion (This needs some more discussion); type: feature request (Request for a new feature)

Comments

@ofek
Contributor

ofek commented Dec 11, 2024

Description

I was attempting to fix this issue but after going deep in the code base it appears that pip (and packaging, at least currently) are fundamentally incapable of cross-platform resolution and any path forward would require a determination from maintainers as to the desired course of action.

As a basic example, let's say a user running on Windows wishes to output the wheels required to install a set of direct dependencies on Ubuntu 14.04. To accurately determine if a wheel encountered during resolution is supported one must know at least the entire set of marker values and the allowed platform tags. There are two ways of doing this:

  1. Use a lock file and assume pip is also the installer. This entails saving the entire index resolution in a file, and coming up with a format for said file. At the point of installation pip would determine the allowed platform tags based on data taken from the system. This is the approach taken by basically every other tool, such as uv and Poetry.
  2. Require user-supplied resolution constraints. This means coming up with a bespoke file format for storing an environment marker mapping, platform tag array, and potentially other data that is required. This is similar to the approach of Pex.
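To make the tag half of that check concrete, here is a minimal sketch of deciding whether a wheel is usable against an explicitly supplied set of supported tags, rather than against the running interpreter. The helper names (`wheel_tags`, `is_supported`) and the hand-written `TARGET` set are assumptions for illustration, not pip or packaging APIs; only the filename layout comes from the wheel specification.

```python
from itertools import product
from typing import Iterable, Set, Tuple

TagTriple = Tuple[str, str, str]  # (python tag, abi tag, platform tag)


def wheel_tags(filename: str) -> Set[TagTriple]:
    """Expand the (possibly compressed) tag set encoded in a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    # Layout: name-version[-build]-python-abi-platform.whl
    python, abi, platform = filename[: -len(".whl")].split("-")[-3:]
    return set(product(python.split("."), abi.split("."), platform.split(".")))


def is_supported(filename: str, supported: Iterable[TagTriple]) -> bool:
    """Return True if any tag of the wheel is in the target's supported set."""
    return not wheel_tags(filename).isdisjoint(supported)


# Hypothetical, hand-written target: CPython 3.9 on an old x86-64 Linux.
TARGET = {
    ("cp39", "cp39", "manylinux1_x86_64"),
    ("cp39", "abi3", "manylinux1_x86_64"),
    ("py3", "none", "any"),
}

print(is_supported("numpy-1.19.5-cp39-cp39-manylinux1_x86_64.whl", TARGET))
print(is_supported("numpy-1.19.5-cp39-cp39-win_amd64.whl", TARGET))
```

This only answers yes or no; a real resolver would also need the tags in preference order, and the marker values for the same target.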

The expressed desired path in the linked issue was 2, but I don't think that is a good idea for a few reasons:

  • If there ever is support for lock files then there would be essentially two ways to achieve the same outcome, with the lock file approach being superior in terms of reproducibility.
  • More importantly, users supplying such constraint information would have a poor experience because if we are to prioritize correctness (as people expect from pip) then we couldn't take the approach of Pex and allow for the best-effort guessing of environment markers. Therefore users would have to save the exact environment of a target interpreter (currently by running a command like python -c "from packaging.markers import default_environment;print(default_environment())"). This, even if we provide a nice command, is an odd thing to require. Users would still have to learn the concept of platform tags and understand how to transform their actual constraints to platform tags, like the target system's version of glibc. This approach seems to be only useful for deployments where the target machine is known ahead of time, and further only really useful for a single target or else users would have to maintain multiple such constraint files. For deployments however, one would require/desire reproducibility and this would provide none. The only reason this works for Pex is because the wheels themselves are shipped as the end product.
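For a sense of what "saving the exact environment of a target interpreter" involves, the following is a rough, standard-library-only sketch that collects the PEP 508 marker variables and serialises them as JSON. It is meant to approximate what `default_environment()` reports, not to replace it; the function names here are hypothetical and the output file format is an assumption.

```python
import json
import os
import platform
import sys


def _format_full_version(info) -> str:
    # Mirrors the helper described in PEP 508 for implementation_version.
    version = f"{info.major}.{info.minor}.{info.micro}"
    if info.releaselevel != "final":
        version += info.releaselevel[0] + str(info.serial)
    return version


def capture_marker_environment() -> dict:
    """Collect the PEP 508 marker values of the interpreter running this code."""
    return {
        "implementation_name": sys.implementation.name,
        "implementation_version": _format_full_version(sys.implementation.version),
        "os_name": os.name,
        "platform_machine": platform.machine(),
        "platform_python_implementation": platform.python_implementation(),
        "platform_release": platform.release(),
        "platform_system": platform.system(),
        "platform_version": platform.version(),
        "python_full_version": platform.python_version(),
        "python_version": ".".join(platform.python_version_tuple()[:2]),
        "sys_platform": sys.platform,
    }


def dump_marker_environment() -> str:
    """Serialise the marker values so they can be copied off the target machine."""
    return json.dumps(capture_marker_environment(), indent=2, sort_keys=True)
```

Note that this has to run on the target itself, and it still says nothing about platform tags, which is the part of the burden described above.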

Despite that approach being what I perceive to be the maintainers' preference, I think almost no one would use that in practice and it would be a wasted effort, especially when a lock approach is possible and would actually allow for the expected UX of --platform=... in a more predictable manner.

I'm curious to hear the thoughts of maintainers and whether they think that a better path would be to wait for Brett's lock file proposal.

Describe the solution you'd like

N/A

Alternative Solutions

Officially assert that cross-platform resolution will be unsupported

Additional context

Random notes:

  • My employer, Datadog, is donating engineering resources this quarter to various Python packaging improvements. One of my tasks was to implement cross-platform resolution capabilities by fixing this issue.
  • Functions in packaging such as this appear to offer support for defining the target platform but that is largely an illusion and does not work. For example, code paths attempt to query Linux-only functions which fail on Windows.
  • As an example of what others do, here is UV's function for determining platform tags based on the local target interpreter and here you can see them making a differentiation between target platforms based on, as an example, the version of manylinux (glibc compatibility).
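The manylinux distinction can be illustrated with a small sketch that derives candidate platform tags from a glibc version, following the naming scheme of PEP 600 and its legacy aliases. This is not uv's or packaging's code; `manylinux_platform_tags` is a hypothetical helper, the cut-off at glibc 2.5 is a simplifying assumption, and taking glibc 2.19 for the Ubuntu 14.04 example is likewise an assumption to verify against the real target.

```python
from typing import List, Tuple

# Legacy aliases and the glibc versions they correspond to (PEP 600).
LEGACY_ALIASES = {
    (2, 17): "manylinux2014",
    (2, 12): "manylinux2010",
    (2, 5): "manylinux1",
}


def manylinux_platform_tags(glibc: Tuple[int, int], arch: str) -> List[str]:
    """List platform tags a glibc-based target could accept, newest first."""
    major, minor = glibc
    if major != 2:
        raise ValueError("this sketch only handles glibc 2.x")
    tags = []
    # Assumption: stop at glibc 2.5, the oldest version with a manylinux alias.
    for m in range(minor, 4, -1):
        tags.append(f"manylinux_{major}_{m}_{arch}")
        alias = LEGACY_ALIASES.get((major, m))
        if alias:
            tags.append(f"{alias}_{arch}")
    tags.append(f"linux_{arch}")
    return tags


# Assumed target: x86-64 Linux with glibc 2.19.
ubuntu_1404_tags = manylinux_platform_tags((2, 19), "x86_64")
```

A wheel tagged `manylinux_2_28_x86_64` would not appear in that list, which is exactly the kind of detail a bare `--platform=linux` cannot express.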

Code of Conduct

ofek added the "S: needs triage" and "type: feature request" labels Dec 11, 2024
@jsirois
Contributor

jsirois commented Dec 11, 2024

@ofek as a point of reference, Pex uses off-the-shelf Pip with minimal runtime patches to achieve this; so although it's true Pip doesn't support cross-platform resolution today, it's not far off at all either.
Here are the patches:

@jsirois
Contributor

jsirois commented Dec 11, 2024

And @ofek I think you misunderstand your approach 2. Pex supports both locking and 3 target types (really 4); the target types are what your point 2 addresses:

  1. LocalInterpreter
  2. AbbreviatedPlatform
  3. CompletePlatform
  4. Universal

And you can lock using any of these. In fact you can multi-lock using any combo of the 1st 3 or else produce a single universal lock. They are orthogonal concepts. The target is the target of a resolve or a lock, etc.

@pfmoore
Member

pfmoore commented Dec 11, 2024

> it appears that pip (and packaging, at least currently) are fundamentally incapable of cross-platform resolution and any path forward would require a determination from maintainers as to the desired course of action.

I think it would be fair to say that cross-platform installs were never a core goal for pip, and the --platform etc., flags were an incomplete attempt to add something without thinking through the implications. For example, pip has no way of reliably installing from a sdist that includes native extensions for a different platform - not least because build backends haven't come up with a standard way of supporting cross-compilations, so there's nothing for pip to work with.

Improving what pip can do would be worthwhile, but as was pointed out in #11664, this will likely involve either getting the user to specify more of the target environment's features (not ideal, it's messy enough already) or using some form of "user friendly specification to environment description" translation. Such a translation should be available across tools, so it should either go into packaging (if they were willing to accept it), or a 3rd party library we can vendor (if someone wanted to write it) or be defined as a standard. What I don't think we should do is try to put such platform determination logic into pip, as then other tools will have to reimplement it [1] and we'll end up with discrepancies between tools.

> If there ever is support for lock files then there would be essentially two ways to achieve the same outcome, with the lock file approach being superior in terms of reproducibility.

Agreed. But in my view, that says that pip should stop trying to do cross-platform resolves, and leave that to other tools that can tackle the various issues and generate a standards-conforming lockfile which pip can then install from. This is the ideal form of interoperability: standards enabling specialised tools to do what they are best at.

There's still a UI issue here, though, as we can't avoid the problem of needing the user to specify the target platform in sufficient detail to do the resolution - it makes little difference whether the necessary marker evaluation is done by a simplified lockfile-install process, or by the full resolver.

> Use a lock file and assume pip is also the installer.

I'm not sure what you mean by this. For a start, it means we'd be waiting for lockfiles to be standardised [2], and the way that discussion is going, it's likely that cross-platform (or "multi scenario" in the terms being used in that discussion) lockfiles won't be part of that standard. Furthermore, I don't see pip as being a "locker" in terms of that standard - we'll install lockfiles created by other tools, but we won't create them ourselves (with one exception, see below). So all this does is push the problem onto other tools - which is fine by me, I guess, but doesn't seem like it's solving anything.

The only form of "saved resolution" pip could (or should, IMO) support is in terms of recording the result of a pip install run [3]. And that doesn't help with this problem, as we're talking here about how to improve what pip install does, so recording that after the fact isn't helpful. I suspect what you have in mind is some way of recording a partial resolution, leaving a later installer on the target platform to finalise based on the target environment. But that's exactly what a "multi-scenario locker" does, and it's not how pip's resolver works.

> More importantly, users supplying such constraint information would have a poor experience because if we are to prioritize correctness (as people expect from pip) then we couldn't take the approach of Pex and allow for the best-effort guessing of environment markers.

Once again, I agree. The user experience is the key problem here. And it's hard, there's no doubt about that. But any means of cross-platform install will require marker evaluation (if you haven't already, go and read the lockfile PEP discussion for all the painful details on this!) so finding a good UI for letting the user define a target platform is going to be necessary however we want to tackle this. Pip's current approach with --platform, --implementation and --abi flags is flawed, possibly fatally, as it essentially requires the sort of "best-effort guessing" you want to avoid.
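As an illustration of the marker evaluation being discussed, the sketch below evaluates one marker against two user-supplied target descriptions using `packaging.markers.Marker.evaluate`. The marker string and the two environment dictionaries are made up for the example. It assumes the `packaging` library is installed.

```python
from packaging.markers import Marker

marker = Marker('sys_platform == "win32" and python_version < "3.8"')

# Hypothetical target values supplied by the user, not read from this machine.
windows_37 = {"sys_platform": "win32", "python_version": "3.7"}
linux_37 = {"sys_platform": "linux", "python_version": "3.7"}

applies_on_windows = marker.evaluate(environment=windows_37)
applies_on_linux = marker.evaluate(environment=linux_37)

print(applies_on_windows, applies_on_linux)
```

One caveat worth noting: `evaluate` overlays the supplied dictionary on the running interpreter's own values, so any marker variable left out of a partial description is silently taken from the local machine. That is a small-scale version of the guessing problem described above.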

So we need something new, and as I've already said, that "something" should probably be in a separate library. That library could offer tools to serialise an environment's definition, store environment definitions with user defined aliases ("production-server", for example), guess a specification based on flags (guessing isn't bad as long as the user knows it's happening...), etc. And it could offer an API for clients to retrieve specs via a standard interface.
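Purely as a conversation starter, here is a sketch of what the smallest version of such a library's data model might look like: a serialisable target description plus an alias registry. Every name in it (`TargetSpec`, `SpecRegistry`, the `guessed` flag, the JSON layout) is hypothetical and not an existing or proposed API.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class TargetSpec:
    """A target environment: marker values plus ordered platform tags."""

    markers: Dict[str, str]
    tags: List[str]
    guessed: bool = False  # True if derived from flags rather than measured

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "TargetSpec":
        return cls(**json.loads(text))


class SpecRegistry:
    """Store target descriptions under user-defined aliases."""

    def __init__(self) -> None:
        self._specs: Dict[str, str] = {}

    def register(self, alias: str, spec: TargetSpec) -> None:
        self._specs[alias] = spec.to_json()

    def get(self, alias: str) -> TargetSpec:
        try:
            return TargetSpec.from_json(self._specs[alias])
        except KeyError:
            raise LookupError(f"no target named {alias!r}") from None


registry = SpecRegistry()
registry.register(
    "production-server",
    TargetSpec(
        markers={"sys_platform": "linux", "python_version": "3.9"},
        tags=["cp39-cp39-manylinux2014_x86_64", "py3-none-any"],
    ),
)
```

Keeping the `guessed` flag on the stored description is one way to make sure the user can always find out that guessing happened.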

Or something else. I'm bad at UI design, so take the above with that in mind. But the core point, that this should be a public library, not a private function within pip, is the key.

> I'm curious to hear the thoughts of maintainers and whether they think that a better path would be to wait for Brett's lock file proposal.

I don't think this problem (cross-platform resolution) is urgent, so I'm happy in general with "wait and see where the ecosystem is going". I'm not convinced that lockfiles will solve this issue - unless things take a surprising turn and we find a way to agree on multi-scenario lockfiles, the only gain we'll get from a standard lockfile is that users can (for example) use PDM's (tool-specific) cross-platform locking and then export a standard lockfile (again within PDM) for the target environment that pip could install using pip install --lockfile=pyproject.lock --target=./lib.

I think the biggest issue is working out a UI for specifying an interpreter/platform. That is work that will have to be done regardless of what tool the user prefers when doing the cross-platform resolve. So if you have funded time to work on cross-platform issues, I think it would be best spent developing a library that addresses that side of the problem.

Footnotes

  1. Remember, pip isn't a library and has no API.

  2. There's no way I want pip to get into the current "invent your own tool-specific lockfile" game 🙁

  3. This is what I was referring to as the one exception to the statement "pip is not a locker" - the report from pip install --dry-run --report should contain sufficient information to generate a single-scenario lockfile, and a 3rd party tool could produce a lockfile from that. Or pip could grow a --report-format=lockfile option. But this inherits all of the problems of the cross-platform behaviour of pip install, so it's not a solution to those problems, just an example of them.
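To show roughly what a third-party tool could do with the report mentioned in footnote 3, here is a sketch that turns the JSON written by something like `pip install --dry-run --report report.json ...` into pinned, hashed requirement lines. The function name `pins_from_report` is hypothetical, the sample report is hand-written and heavily abbreviated, and the key names used (`install`, `metadata`, `download_info`, `archive_info`, `hashes`) should be checked against the installation report documentation for the pip version in use.

```python
from typing import List


def pins_from_report(report: dict) -> List[str]:
    """Build 'name==version --hash=...' lines from a pip installation report."""
    lines = []
    for item in report.get("install", []):
        meta = item["metadata"]
        line = f'{meta["name"]}=={meta["version"]}'
        # Hashes may be absent, e.g. for some direct or VCS installs.
        archive_info = item.get("download_info", {}).get("archive_info", {})
        for algorithm, digest in sorted(archive_info.get("hashes", {}).items()):
            line += f" --hash={algorithm}:{digest}"
        lines.append(line)
    return sorted(lines)


# Hand-written, abbreviated stand-in for a real report (illustrative values).
sample_report = {
    "version": "1",
    "install": [
        {
            "metadata": {"name": "six", "version": "1.16.0"},
            "download_info": {
                "url": "https://example.invalid/six-1.16.0-py2.py3-none-any.whl",
                "archive_info": {"hashes": {"sha256": "abc"}},
            },
        },
        {
            "metadata": {"name": "attrs", "version": "23.1.0"},
            "download_info": {"url": "https://example.invalid/attrs.whl"},
        },
    ],
}

pins = pins_from_report(sample_report)
```

The resulting pins reflect whatever markers and tags pip evaluated during that one run, so they describe a single scenario and inherit the cross-platform limitations discussed in this thread.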

ichard26 added the "C: dependency resolution" and "state: needs discussion" labels and removed the "S: needs triage" label Dec 21, 2024
4 participants