Exploring nspawn for CIs

pdo debian eng sw systemd

This post is part of a series about trying to setup a gitlab runner based on systemd-nspawn. I published the polished result as nspawn-runner on GitHub.

Here I try to figure out possible ways of invoking nspawn for the prepare, run, and cleanup steps of gitlab custom runners. The results might be useful invocations beyond Gitlab's scope of application.

I begin with a chroot which will be the base for our build environments:

debootstrap --variant=minbase --include=git,build-essential buster workdir

Fully ephemeral nspawn

This would be fantastic: set up a reusable chroot, mount readonly, run the CI in a working directory mounted on tmpfs. It sets up quickly, it cleans up after itself, and it would make prepare and cleanup noops:

mkdir workdir/var/lib/gitlab-runner
systemd-nspawn --read-only --directory workdir --tmpfs /var/lib/gitlab-runner "$@"

However, run gets run multiple times, so I need the side effects of run to persist inside the chroot between runs.

Also, if the CI uses a large amount of disk space, tmpfs may get into trouble.

nspawn with overlay

Federico used --overlay to keep the base chroot readonly while allowing persistent writes on a temporary directory on the filesystem.

Note that using --overlay requires systemd and systemd-container from buster-backports because of systemd bug #3847.

Example:

mkdir -p tmp-overlay
systemd-nspawn --quiet -D workdir \
  --overlay="`pwd`/workdir:`pwd`/tmp-overlay:/"

I can run this twice, and changes in the file system will persist between systemd-nspawn executions. Great! However, any process will be killed at the end of each execution.

machinectl

I can give a name to systemd-nspawn invocations using --machine, and it allows me to run multiple commands during the machine lifespan using machinectl and systemd-run.

In theory machinectl can also fully manage chroots and disk images in /var/lib/machines, but I haven't found a way with machinectl to start multiple machines sharing the same underlying chroot.

It's ok, though: I managed to do that with systemd-nspawn invocations.

I can use the --machine=name argument to systemd-nspawn to make it visible to machinectl. I can use the --boot argument to systemd-nspawn to start enough infrastructure inside the container to allow machinectl to interact with it.

This gives me any number of persistent and named running systems, that share the same underlying chroot, and can cleanup after themselves. I can run commands in any of those systems as I like, and their side effects persist until a system is stopped.

The chroot needs systemd and dbus for machinectl to be able to interact with it:

debootstrap --variant=minbase --include=git,systemd,systemd,build-essential buster workdir

Let's boot the machine:

mkdir -p overlay
systemd-nspawn --quiet -D workdir \
    --overlay="`pwd`/workdir:`pwd`/overlay:/"
    --machine=test --boot

Let's try machinectl:

# machinectl list
MACHINE CLASS     SERVICE        OS     VERSION ADDRESSES
test    container systemd-nspawn debian 10      -

1 machines listed.
# machinectl shell --quiet test /bin/ls -la /
total 60
[…]

To run commands, rather than machinectl shell, I need to use systemd-run --wait --pipe --machine=name, otherwise machined won't forward the exit code. The result however is pretty good, with working stdin/stdout/stderr redirection and forwarded exit code.

Good, I'm getting somewhere.

The terminal where I ran systemd-nspawn is currently showing a nice getty for the booted system, which is cute, and not what I want for the setup process of a CI.

Spawning machines without needing a terminal

machinectl uses /lib/systemd/system/systemd-nspawn@.service to start machines. I suppose there's limited magic in there: start systemd-nspawn as a service, use --machine to give it a name, and machinectl manages it as if it started it itself.

What if, instead of installing a unit file for each CI run, I try to do the same thing with systemd-run?

systemd-run \
  -p 'KillMode=mixed' \
  -p 'Type=notify' \
  -p 'RestartForceExitStatus=133' \
  -p 'SuccessExitStatus=133' \
  -p 'Slice=machine.slice' \
  -p 'Delegate=yes' \
  -p 'TasksMax=16384' \
  -p 'WatchdogSec=3min' \
  systemd-nspawn --quiet -D `pwd`/workdir \
    --overlay="`pwd`/workdir:`pwd`/overlay:/"
    --machine=test --boot

It works! I can interact with it using machinectl, and fine tune DevicePolicy as needed to lock CI machines down.

This setup has a race condition where if I try to run a command inside the machine in the short time window before the machine has finished booting, it fails:

# systemd-run […] systemd-nspawn […] ; machinectl --quiet shell test /bin/ls -la /
Failed to get shell PTY: Protocol error
# machinectl shell test /bin/ls -la /
Connected to machine test. Press ^] three times within 1s to exit session.
total 60
[…]

systemd-nspawn has the option --notify-ready=yes that solves exactly this problem:

# systemd-run […] systemd-nspawn […] --notify-ready=yes ; machinectl --quiet shell test /bin/ls -la /
Running as unit: run-r5a405754f3b740158b3d9dd5e14ff611.service
total 60
[…]

On nspawn's side, I should now have all I need.

Next steps

My next step will be wrapping it all together in a gitlab runner.