Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot set multi-platform bake contexts if platforms do not match exactly #2486

Open
3 tasks done
PigeonF opened this issue May 30, 2024 · 10 comments · May be fixed by #2877
Open
3 tasks done

Cannot set multi-platform bake contexts if platforms do not match exactly #2486

PigeonF opened this issue May 30, 2024 · 10 comments · May be fixed by #2877

Comments

@PigeonF
Copy link

PigeonF commented May 30, 2024

Contributing guidelines

I've found a bug and checked that ...

  • ... the documentation does not mention anything about my problem
  • ... there are no open or closed issues that are related to my problem

Description

When using a target: directive as a bakefile context in a multi-platform scenario, the platforms of the targets have to match exactly or the build fails.

Expected behaviour

The build should succeed if the referencing target is built for a subset of the platforms of the referenced target.

Actual behaviour

The build fails unless the platforms match exactly.

Buildx version

github.com/docker/buildx v0.14.0

Docker info

Client:
 Version:    24.0.9
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.0
    Path:     /nix/store/jidnm42865p7pisj8i7nils91ianj19f-docker-plugins/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  2.27.0
    Path:     /nix/store/jidnm42865p7pisj8i7nils91ianj19f-docker-plugins/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
 Images: 16
 Server Version: 24.0.9
 Storage Driver: overlayfs
  driver-type: io.containerd.snapshotter.v1
 Logging Driver: journald
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: v1.7.16
 runc version:
 init version:
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.6.32
 Operating System: NixOS 24.05 (Uakari)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 23.37GiB
 Name: geonosis
 ID: 1bde965b-61b0-4cbd-94e0-91a66e0c1109
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  registry.internal
  cache.internal
  127.0.0.0/8
 Registry Mirrors:
  http://cache.internal/
 Live Restore Enabled: true

Builders list

NAME/NODE     DRIVER/ENDPOINT                    STATUS    BUILDKIT               PLATFORMS
remote*       remote
 \_ remote0    \_ tcp://buildkit.internal:3375   running   4cf5e34                linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386
default       docker
 \_ default    \_ default                        running   v0.11.7+435cb77e369c   linux/amd64, linux/amd64/v2, linux/amd64/v3

Configuration

target "default" {
  dockerfile-inline = "FROM other"
  contexts = {
    other = "target:other"
  }
  platforms = [
    "linux/amd64",
    "linux/arm64",
  ]
}

target "other" {
  dockerfile-inline = "FROM docker.io/library/alpine:3"
  platforms = [
    "linux/amd64",
    "linux/arm64",
    "linux/arm/v7",
  ]
}

Build logs

#0 building with "remote" instance using remote driver

#1 [internal] load local bake definitions
#1 reading docker-bake.hcl 314B / 314B done
#1 DONE 0.0s
ERROR: target other can't be used by default because it is defined for different platforms [linux/amd64 linux/arm64 linux/arm/v7] and [linux/amd64 linux/arm64]

Additional info

No response

@crazy-max
Copy link
Member

crazy-max commented May 30, 2024

remote*       remote
 \_ remote0    \_ tcp://buildkit.internal:3375   running   4cf5e34                linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386

Seems like your machine does not support arm architectures which is why your build fails with either remote or default builders. You can either append a native arm node to remote builder or setup QEMU on your host as shown in https://docs.docker.com/build/building/multi-platform/#qemu-without-docker-desktop

@PigeonF
Copy link
Author

PigeonF commented May 30, 2024

The issue is not that my machine does not support ARM (I can docker buildx bake other fine), just that other cannot be used as the reference. That is, if I can docker buildx bake other I would expect docker buildx bake default to work as well.

You can check this locally by changing the platforms for mid in

buildx/bake/bake_test.go

Lines 878 to 881 in 9ad116a

target "mid" {
output = ["foo"]
platforms = ["linux/amd64", "linux/arm64"]
}

to platforms = ["linux/amd64", "linux/arm64", "linux/arm/v7"]

@crazy-max
Copy link
Member

crazy-max commented May 30, 2024

Oh my bad I can't read 🙈

So yes platforms need to perfectly match otherwise it fails with this error because there is no proper linking between platforms in your named context. Does it work with the following definition?:

target "default" {
  dockerfile-inline = "FROM other"
  contexts = {
    other = "target:other"
  }
  platforms = [
    "linux/amd64",
    "linux/arm64",
    "linux/arm/v7"
  ]
}

target "other" {
  dockerfile-inline = "FROM docker.io/library/alpine:3"
  platforms = [
    "linux/amd64",
    "linux/arm64",
    "linux/arm/v7"
  ]
}

Maybe if a named context does not match platforms of caller, we could override them.

@PigeonF
Copy link
Author

PigeonF commented May 30, 2024

Yes, the snippet you link works. The issue is that I want to build two separate targets, one of which supports more platforms than the other 😅 .

I think changing

buildx/bake/bake.go

Lines 501 to 505 in 9ad116a

if len(t.Platforms) > 1 && len(t2.Platforms) > 1 {
if !sliceEqual(t.Platforms, t2.Platforms) {
return errors.Errorf("target %s can't be used by %s because it is defined for different platforms %v and %v", target, name, t2.Platforms, t.Platforms)
}
}

to check if t.Platforms is a subset of t2.Platforms instead of equality should suffice? Or is that what you mean with there not being proper linking between named context and platforms (i.e. this check would pass, but the build would fail down the line because of some platform mismatch)?

@crazy-max
Copy link
Member

to check if t.Platforms is a subset of t2.Platforms instead of equality should suffice?

Yeah maybe a subset would be enough but wonder if should not override with platforms from caller instead per my comment #2486 (comment). WDYT @tonistiigi?

@saracen
Copy link

saracen commented Dec 19, 2024

to check if t.Platforms is a subset of t2.Platforms instead of equality should suffice?

Yeah maybe a subset would be enough but wonder if should not override with platforms from caller instead per my comment #2486 (comment). WDYT @tonistiigi?

@crazy-max @tonistiigi @PigeonF I recently stumbled across this problem and I think a subset, rather than overriding, is what I'd expect.

If I referenced an image here rather than a target, and the platform wasn't found, it would return an error. I've been treating target:something effectively like an unpublished image, so I'd expect similar behaviour.

However, whilst that's what I'd expect to happen, I don't know whether that's a good enough reason to not just override.

I'm hoping my thoughts here at least help us choose a direction. Do they? 😄 I'd really like to contribute the fix when we know the path forward.

@tonistiigi
Copy link
Member

I think subset-based validation should work fine.

@saracen
Copy link

saracen commented Dec 19, 2024

@tonistiigi

It's kind of odd that the check only comes into play with len(t.Platforms) > 1 && len(t2.Platforms) > 1.

So if the target or referenced target only have the one platform each, they can completely differ from each other. I suspect originally this was suppose to use len(t.Platforms) > 0 && len(t2.Platforms) > 0.

Changing that now would probably be a breaking change. It also means that as long as you only provided one platform, the referenced target platform never mattered before. Due to this, maybe it should override, rather than be a subset? It feels odd to enforce a subset only when there's 2 or more platforms.

@saracen
Copy link

saracen commented Dec 19, 2024

Apologies for the noise.

the referenced target platform never mattered before

It turns out it does matter.

target "ubuntu" {
  contexts = {
    ubuntu = "docker-image://ubuntu:20.04"
  }

  platforms = ["linux/amd64"]
}

target "ubuntu-other" {
  contexts = {
    ubuntu = "target:ubuntu"
  }

  platforms = ["linux/arm64"]
}

This passes validation, but will only create an amd64 image. The target being a subset does matter, it's just that the current validation isn't correct.

@tonistiigi
Copy link
Member

That > 1 comes from the (historic) behavior that if the image is not a multi-platform image, then it can be included in any context (e.g. FROM wrongarchimage), while if the name points to a multi-platform image, then you do need to match one of the platforms.

In a more practical sense, an example of a valid use case on mismatching platforms is cross-compilation. The target you are importing may be always used for your native architecture, while the caller target may be multi-platform. For example https://github.com/tonistiigi/xx/blob/master/docker-bake.hcl#L189 does that.

From API standpoint, named contexts in BuildKit can have a platform identifier or may not. If they don't then they match against any platform. For example, this is how platform matching works internally and how bake uses to map them.

from foo
run uname -m  && cat /etc/alpine-release || true
buildctl build --opt context:foo::linux/amd64=docker-image://busybox --opt context:foo::linux/arm64=docker-image://alpine --opt platform=linux/amd64,linux/arm64 --local dockerfile=. --progress=plain --frontend=dockerfile.v0

...
#7 [linux/arm64 1/2] RUN uname -m  && cat /etc/alpine-release || true
#7 0.074 aarch64
#7 0.074 3.21.0
#7 DONE 0.1s

#8 [linux/amd64 1/2] RUN uname -m  && cat /etc/alpine-release || true
#8 0.083 x86_64
#8 0.092 cat: can't open '/etc/alpine-release': No such file or directory
#8 DONE 0.1s

In a normal case you would just do --opt context:foo=docker-image://alpine and then same target would match any platform.

@saracen saracen linked a pull request Dec 20, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants