Compatibility issue between QEMU and Linux kernel causes buildx to fail #2841

Open
3 tasks done
baizhenyu opened this issue Dec 4, 2024 · 1 comment


@baizhenyu

Contributing guidelines

I've found a bug and checked that ...

  • ... the documentation does not mention anything about my problem
  • ... there are no open or closed issues that are related to my problem

Description

Our team has been using buildx to build multi-arch fluent-bit images for a long time. However, the ARM64 image build on debian:bullseye started failing two months ago with the following error:

#39 249.8 ===============================================================================
#39 250.9 [ 27%] Performing build step for 'jemalloc'
#39 251.5 gcc: internal compiler error: Segmentation fault signal terminated program cc1
#39 251.5 Please submit a full bug report,
#39 251.5 with preprocessed source if appropriate.

After extensive troubleshooting, we found what appears to be a compatibility issue between QEMU and the Debian kernels 5.10.0-33-cloud-amd64 and 5.10.0-33-debian-amd64. We set up QEMU for buildx as follows:

sudo docker run --privileged --rm tonistiigi/binfmt:qemu --install all
sudo docker buildx create --name builder --use
sudo docker buildx inspect --bootstrap
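
To rule out a registration problem before blaming the emulator, the setup above can be checked with a sketch like the following. It assumes binfmt_misc is mounted at the usual /proc/sys/fs/binfmt_misc path and that the handler is registered under the name qemu-aarch64; check_arm64_emulator and smoke_test_arm64 are hypothetical helper names, not part of buildx.

```shell
# Hypothetical helper: report whether an arm64 emulator is registered with
# binfmt_misc. $1 optionally overrides the entry path (assumed default below).
check_arm64_emulator() {
    entry="${1:-/proc/sys/fs/binfmt_misc/qemu-aarch64}"
    # A registered and active handler starts with the line "enabled".
    if [ -r "$entry" ] && grep -q '^enabled' "$entry"; then
        echo "registered"
    else
        echo "missing"
    fi
}

# Hypothetical helper: run a trivial arm64 binary under emulation.
# Prints the machine name seen inside the emulated container.
smoke_test_arm64() {
    sudo docker run --rm --platform linux/arm64 debian:bullseye uname -m
}
```

Calling check_arm64_emulator and then smoke_test_arm64 on the host separates "the emulator is not registered" from "the emulator is registered but crashes under load", which is the case described here.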

We tried several QEMU versions, including 6.2, 7.0, 8.2 and 9.2.1 (latest), but none of them works with this kernel. As soon as I downgrade the kernel to 5.10.0-32-cloud-amd64, the build works again.
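
A version comparison like this can be scripted roughly as in the sketch below. It assumes the tonistiigi/binfmt image publishes version-specific tags (the tag names passed in are placeholders to be taken from the image's tag list), and try_qemu_versions and LOG_DIR are hypothetical names introduced only for illustration.

```shell
# Hypothetical helper: for each binfmt image tag given as an argument,
# re-register the arm64 emulator, retry the failing build, and report the result.
try_qemu_versions() {
    log_dir="${LOG_DIR:-.}"   # where per-tag build logs are written (assumed name)
    for tag in "$@"; do
        # Remove the currently registered QEMU handlers, then install the one under test.
        sudo docker run --privileged --rm tonistiigi/binfmt --uninstall 'qemu-*' >/dev/null
        sudo docker run --privileged --rm "tonistiigi/binfmt:$tag" --install arm64 >/dev/null
        # Same build command as in the report; output is kept in a per-tag log file.
        if sudo docker buildx build --platform=linux/arm64 \
            -f ./dockerfiles/Dockerfile . >"$log_dir/build-$tag.log" 2>&1; then
            echo "$tag: ok"
        else
            echo "$tag: failed"
        fi
    done
}
```

Running the same loop once per kernel (5.10.0-32 and 5.10.0-33) would give a small pass/fail matrix to attach to an upstream report.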

Since changing the kernel version, building natively, and cross-compiling are not options for our CI/CD pipeline, we are wondering how we can move forward with this problem.

Expected behaviour

The build succeeds.

Actual behaviour

The build fails with an internal compiler error.

Buildx version

github.com/docker/buildx v0.17.1 257815a

Docker info

Client: Docker Engine - Community
 Version:    27.3.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.17.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.29.7
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 5
 Server Version: 27.3.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 88bf19b2105c8b17560993bee28a01ddc2f97182
 runc version: v1.2.2-0-g7cb3632
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.10.0-33-cloud-amd64
 Operating System: Debian GNU/Linux 11 (bullseye)
 OSType: linux
 Architecture: x86_64
 CPUs: 32
 Total Memory: 31.35GiB
 Name: instance-20241130-035110
 ID: e00475b0-25b9-473b-a839-38acdcf7cb77
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Builders list

NAME/NODE      DRIVER/ENDPOINT                   STATUS    BUILDKIT   PLATFORMS
builder*       docker-container
 \_ builder0    \_ unix:///var/run/docker.sock   running   v0.17.2    linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386
default        docker
 \_ default     \_ default                       running   v0.16.0    linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386

Configuration

https://github.com/fluent/fluent-bit/blob/master/dockerfiles/Dockerfile

docker buildx build --platform=linux/arm64 -f ./dockerfiles/Dockerfile .

Build logs


Additional info

No response

@tonistiigi
Member

Defining ENV QEMU_STRACE=1 will show you a trace of the syscalls proxied by the emulator and may point to the potential issue. If that works, you can try submitting your findings to the QEMU upstream tracker.
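
As a rough sketch of how such a trace might be collected outside the Dockerfile, the same variable can be passed to a one-off emulated container with docker run -e, which is a variation on the ENV approach rather than the exact method suggested above. The helper names trace_arm64 and failed_syscalls are hypothetical, and the grep pattern assumes that the trace marks failing calls with "errno=" and signals with their SIG name, which may differ between QEMU versions.

```shell
# Hypothetical helper: run a command in an emulated arm64 container with
# QEMU's syscall tracing enabled. The trace is written to stderr.
trace_arm64() {
    sudo docker run --rm --platform linux/arm64 -e QEMU_STRACE=1 \
        debian:bullseye "$@"
}

# Hypothetical helper: keep only the trace lines that look like failed
# syscalls or a segmentation fault (assumed trace format, see lead-in).
failed_syscalls() {
    grep -E 'errno=|SIGSEGV' "$1"
}
```

For example, trace_arm64 ls / 2>trace.log followed by failed_syscalls trace.log narrows a long trace down to the calls worth comparing between the working and the failing kernel.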
