This project isolates a given command and its arguments using a container backend. Supported backends are bubblewrap, lxc, and chroot; the latter is mainly intended for updating the underlying Linux system root.

A unique, fresh ZFS clone is used as the dataset for each isolation and is discarded after use. Changes to the dataset are made on disk, preventing high memory usage under load. A privileged daemon cleans up used clones and provides fresh ones.

The project consists of the main script isolate, the common bindings common.sh, and seccomp_wrapper.c, which complements bubblewrap. The latter works around bubblewrap's --new-session option, which prevents feeding input from the isolated environment to the host terminal, but also prevents job control of the spawned shell.

isolate can work without ZFS. In that case, modifications to the backing Linux system root are persistent. For such a setup, see “Setup without ZFS”.
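
For example, once a dataset has been prepared as described under “Setup”, an isolated shell is spawned like this (rpool/isolate/debisl is the example dataset used throughout this README):

  isolate -z rpool/isolate/debisl bash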

Setup

System preparations

Copy isolate and common.sh to /usr/local/bin. We expect a group isolate to exist; users who should be able to use isolate must be members of this group.
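
These preparations might be scripted as follows (the user name alice is a placeholder):

  install -m 0755 isolate common.sh /usr/local/bin/
  groupadd isolate            # create the group if it does not exist yet
  usermod -aG isolate alice   # grant a user access to isolate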

Creating the backing Linux system root

In the following, rpool/isolate/debisl is the dataset and /isolate/debisl its mountpoint. Both may be chosen arbitrarily.

Do the following steps as root user:

  1. Create a Linux system root dataset, e.g. zfs create rpool/isolate/debisl
  2. Create a template Linux system root using debootstrap, e.g. debootstrap stable /isolate/debisl
  3. [only bwrap] Compile the seccomp wrapper. You need the static libseccomp libraries for this step.
    gentoo: install libseccomp with the static-libs USE flag
    debian: install libseccomp-dev
    
    gcc seccomp_wrapper.c -static -o seccomp_wrapper /usr/lib/x86_64-linux-gnu/libseccomp.a
        
  4. [only bwrap] Copy the seccomp wrapper into the root dataset such that it is in the PATH of the spawned isolated environment, e.g. /usr/bin/seccomp_wrapper
  5. Chroot into the environment and install what you need: isolate -e chroot -t /isolate/debisl bash
  6. After you are done, create a snapshot
    zfs snapshot "rpool/isolate/debisl@$(date "+%Y%m%dT%H%M%SZ")"
        
  7. Generate clones for use by isolate: isolate -r rpool/isolate/debisl 5
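
Put together, a minimal first-time setup for the bwrap engine could look like this (dataset name, mountpoint, and debootstrap suite are the examples from above):

  zfs create rpool/isolate/debisl
  debootstrap stable /isolate/debisl
  gcc seccomp_wrapper.c -static -o seccomp_wrapper /usr/lib/x86_64-linux-gnu/libseccomp.a
  cp seccomp_wrapper /isolate/debisl/usr/bin/
  isolate -e chroot -t /isolate/debisl bash      # install packages, then exit
  zfs snapshot "rpool/isolate/debisl@$(date "+%Y%m%dT%H%M%SZ")"
  isolate -r rpool/isolate/debisl 5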

Your ZFS structure (e.g. as shown by zfs list -r rpool/isolate) will look somewhat like this:

NAME                                             USED  AVAIL     REFER  MOUNTPOINT
rpool/isolate                                   18.8G   488G     9.96G  /isolate
rpool/isolate/deb_gnuradio                      1.86G   488G     1.72G  /isolate/deb_gnuradio
rpool/isolate/deb_gnuradio_1677838587626715157   184K   488G     1.72G  /isolate/deb_gnuradio_1677838587626715157
rpool/isolate/deb_gnuradio_1677838587665325261   184K   488G     1.72G  /isolate/deb_gnuradio_1677838587665325261
rpool/isolate/deb_gnuradio_1677838587703660046   184K   488G     1.72G  /isolate/deb_gnuradio_1677838587703660046
rpool/isolate/debisl                            6.96G   488G     6.39G  /isolate/debisl
rpool/isolate/debisl_1677860441807895636           8K   488G     6.39G  /isolate/debisl_1677860441807895636
rpool/isolate/debisl_1677860480418781581           8K   488G     6.39G  /isolate/debisl_1677860480418781581
rpool/isolate/debisl_1677860488091186587           8K   488G     6.39G  /isolate/debisl_1677860488091186587

Updating the backing Linux system root

Do the following steps as root user:

  1. Chroot into the environment and install what you need: isolate -e chroot -t /isolate/debisl bash
  2. After you are done, create a snapshot
    zfs snapshot "rpool/isolate/debisl@$(date "+%Y%m%dT%H%M%SZ")"
        
  3. Generate clones for use by isolate: isolate -f -r rpool/isolate/debisl 5

Enabling automatic clone regeneration

A systemd path unit watches a signal file (touched by isolate on each exit) and triggers clone regeneration whenever an isolated environment exits and frees its clone. The path and service units we provide expect the isolation Linux system roots to be at rpool/isolate/xxx; if your paths differ, modify the service and path files. The number of clones is also hardcoded in the unit file, so you might want to adjust that as well.

Copy [email protected] and [email protected] to /etc/systemd/system.

To enable regeneration for rpool/isolate/debisl, run systemctl enable --now [email protected]
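
The bundled unit files are authoritative; as a rough sketch of how the pairing works (the signal file location is an assumption, check the shipped [email protected]):

  # [email protected] (sketch)
  [Path]
  PathModified=/isolate/%i/.refresh_signal
  Unit=zfs_refresh_service@%i.service

  [Install]
  WantedBy=multi-user.target

  # [email protected] (sketch; this is where the clone count is hardcoded)
  [Service]
  Type=oneshot
  ExecStart=/usr/local/bin/isolate -r rpool/isolate/%i 5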

X11 and Pulseaudio passthrough

X11 access from the isolated environment is only supported with the bwrap engine, as is PulseAudio support. X11 should work without any configuration. For PulseAudio to work, you have to allow anonymous access to the pulse socket. Add the following to /etc/pulse/default.pa:

load-module module-native-protocol-unix auth-anonymous=1 socket=/tmp/pulse-socket
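
With that in place, both sockets can be bound into the guest at spawn time via the -1 and -2 flags; for example (the application is arbitrary):

  isolate -1 -2 -z rpool/isolate/debisl firefox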

Setup without ZFS

isolate can function without ZFS by using template mode. Create a backing Linux system root in a directory of your choice, then invoke isolate as usual, but with the -t option instead of -z. Modifications to the underlying system root are, of course, permanent in template mode.
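
For example (the directory /srv/isolate_root is a placeholder):

  debootstrap stable /srv/isolate_root
  isolate -e bwrap -t /srv/isolate_root bash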

Executing isolated programs

See isolate -h:

spawn:                        ./isolate [-e engine] [-t templatedir] [-1] [-2] [-n] [-p pwd] [-d dir_host:target_ctr]... [-v env_name:env_value]... [command] [args] ...
spawn:                        ./isolate [-e engine] [-z zfs_template] [-1] [-2] [-n] [-p pwd] [-d dir_host:target_ctr]... [-v env_name:env_value]... [command] [args] ...
refresh available zfs clones: ./isolate [-r template] [-f] [-s] [num_should_avail]
  [!] refresh ALWAYS uses the latest snapshot on the dataset

OPTIONS
         -e: engine: one of {bwrap, chroot, lxc}, defaults to bwrap
         -d: dirs to mount into the sandbox, mounts dir_host to target_ctr in the container
         -u: uid [integer] to use as base uid. uid and the following 65535 uids are mapped to [0:65535] in the sandbox.
         -g: gid [integer] to use as base gid. gid and the following 65535 gids are mapped to [0:65535] in the sandbox.
             NOTE: when using -u or -g, you should align the ownership of the template to the range specified.
         -i: ignore SIGINT, keep running. Currently a workaround until signal passing into the guest is implemented.
         -v: var:value to pass into the sandbox
         -p: pwd: switch to this directory on spawn. Defaults to /
         -1: bind X11 socket into guest
         -2: bind Pulseaudio socket into guest
         -n: share host network
         -x: trace
         -f: regenerate ALL clones of the given zfs template
         -s: skip generating new clones
         -q: quiet: only print warnings and prompts

  add all users that should be able to use zfs features to the `isolate' group

ENVIRONMENT VARIABLES
   PRE_SPAWN_HOOK: command that is run before [command args] are run in the isolated environment
                   $ROOTFS references the root of the isolated environment to be started
   POST_SPAWN_HOOK: command that is run after [command args] has completed in the isolated environment
                   $ROOTFS references the root of the isolated environment that was started
   DISABLE_SECCOMP_WRAPPER [!='']: disable the bwrap seccomp wrapper that prevents IOCTL to the host
   LXC_NET_BR [!='']: bridge to be used by LXC, defaults to br_vm
   LXC_MAP_TUN [!='']: map /dev/net/tun into the container
   LXC_CPU_QUOTA: percentage (1-100), this amount of CPU processing time will be available to the container. Default: 20
   LXC_MEM_MB_QUOTA: memory available to guest, in MB
   DEBUG_LXC_SPAWN [=!'']: if nonzero len, write lxc start log to /tmp/isolate_lxc_log

SUPPORTED BY ENGINE
  | engine | env | {U/G}ID | net | cmd+args | dirs | pwd | X11 | Pulseaudio |
  |--------+-----+---------+-----+----------+------+-----+-----+------------|
  | chroot |     |         |  X  | X        |      |     |     |            |
  | bwrap  |  X  |    *    |  X  | X        | X    | X   |  X  |     X      |
  | lxc    |  X  |    X    |  X  | X        | X    | X   |     |            |

  * this changes the uid in the container, which itself STILL RUNS AS UID 0!
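
As an example of the spawn hooks, this hypothetical invocation stages the host's resolver configuration into the clone before the command runs:

  PRE_SPAWN_HOOK='cp /etc/resolv.conf "$ROOTFS/etc/resolv.conf"' \
      isolate -n -z rpool/isolate/debisl bash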