This is part of a series of posts on ideas for an ansible-like provisioning system, implemented in Transilience.
Unit testing some parts of Transilience, like the apt and systemd actions, or remote Mitogen connections, can really use a containerized system for testing.
To have that, I reused my work on nspawn-runner to build a simple and very fast system of ephemeral containers, with minimal dependencies, based on systemd-nspawn and btrfs snapshots.
Setup
To be able to use systemd-nspawn --ephemeral, the chroots need to be btrfs subvolumes. If you are not running on a btrfs filesystem, you can create one to run the tests, even on a file:
fallocate -l 1.5G testfile
/usr/sbin/mkfs.btrfs testfile
sudo mount -o loop testfile test_chroots/
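Before creating subvolumes, it can be handy to check that the directory really ended up on btrfs. Here is a minimal sketch of one way to do that from Python by parsing the contents of /proc/mounts. The function name filesystem_type is made up for this example. It assumes mount points without spaces (which /proc/mounts escapes as \040) and does not resolve symlinks.

```python
import os


def filesystem_type(path, mounts_text):
    """
    Return the filesystem type of the mount that contains path.

    mounts_text is the content of /proc/mounts: one mount per line, with
    whitespace-separated fields (device, mount point, type, options, ...)
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        # path is inside this mount if it is the mount point itself, or
        # if it lives below it
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # The longest matching mount point wins; on ties, later
            # entries shadow earlier ones
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


# Sample data in /proc/mounts format, for illustration only
SAMPLE_MOUNTS = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "/dev/loop0 /srv/test_chroots btrfs rw,relatime 0 0\n"
)

print(filesystem_type("/srv/test_chroots/buster", SAMPLE_MOUNTS))
```

On a real system one would pass the result of reading /proc/mounts instead of the sample string, and refuse to go on if the answer is not btrfs.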
I created a script to set up the test environment; here is an extract:
mkdir -p test_chroots
cat << EOF > "test_chroots/CACHEDIR.TAG"
Signature: 8a477f597d28d172789f06886806bc55
# chroots used for testing transilience, can be regenerated with make-test-chroot
EOF
btrfs subvolume create test_chroots/buster
eatmydata debootstrap --variant=minbase --include=python3,dbus,systemd buster test_chroots/buster
CACHEDIR.TAG is a nice trick to tell backup software not to bother backing up the contents of this directory, since it can be easily regenerated.
eatmydata is optional, and it speeds up debootstrap quite a bit.
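If the setup is driven from Python rather than from a shell script, the CACHEDIR.TAG step could look something like the sketch below. The function names are made up for this example. The only fixed part is the signature line, which is the same one used in the script extract above and which backup tools look for at the very start of the file.

```python
import os
import tempfile

# Signature line that marks a directory as a cache directory: it must be
# the first thing in the file
CACHEDIR_SIGNATURE = "Signature: 8a477f597d28d172789f06886806bc55"


def write_cachedir_tag(directory, comment):
    """
    Create CACHEDIR.TAG in directory, followed by a free-form comment line
    """
    path = os.path.join(directory, "CACHEDIR.TAG")
    with open(path, "w") as fd:
        fd.write(f"{CACHEDIR_SIGNATURE}\n# {comment}\n")
    return path


def is_cache_dir(directory):
    """
    Check if directory contains a CACHEDIR.TAG starting with the signature
    """
    path = os.path.join(directory, "CACHEDIR.TAG")
    try:
        with open(path, "rb") as fd:
            head = fd.read(len(CACHEDIR_SIGNATURE))
    except OSError:
        return False
    return head == CACHEDIR_SIGNATURE.encode("ascii")


# Try it out on a temporary directory
with tempfile.TemporaryDirectory() as workdir:
    write_cachedir_tag(workdir, "chroots used for testing, can be regenerated")
    print(is_cache_dir(workdir))
```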
Running unittest with sudo
Here's a simple helper to drop root as soon as possible, and regain it only when needed. Note that it needs $SUDO_UID and $SUDO_GID, which are set by sudo, to know which user to drop into:
import contextlib
import os


class ProcessPrivs:
    """
    Drop root privileges and regain them only when needed
    """
    def __init__(self):
        self.orig_uid, self.orig_euid, self.orig_suid = os.getresuid()
        self.orig_gid, self.orig_egid, self.orig_sgid = os.getresgid()

        if "SUDO_UID" not in os.environ:
            raise RuntimeError("Tests need to be run under sudo")

        self.user_uid = int(os.environ["SUDO_UID"])
        self.user_gid = int(os.environ["SUDO_GID"])

        self.dropped = False

    def drop(self):
        """
        Drop root privileges
        """
        if self.dropped:
            return
        # Change group first, while we still have the privileges to do it.
        # Root is kept as the saved id, so that it can be regained later
        os.setresgid(self.user_gid, self.user_gid, 0)
        os.setresuid(self.user_uid, self.user_uid, 0)
        self.dropped = True

    def regain(self):
        """
        Regain root privileges
        """
        if not self.dropped:
            return
        # Change user first: we need to be root again to change group
        os.setresuid(self.orig_suid, self.orig_suid, self.user_uid)
        os.setresgid(self.orig_sgid, self.orig_sgid, self.user_gid)
        self.dropped = False

    @contextlib.contextmanager
    def root(self):
        """
        Regain root privileges for the duration of this context manager
        """
        if not self.dropped:
            yield
        else:
            self.regain()
            try:
                yield
            finally:
                self.drop()

    @contextlib.contextmanager
    def user(self):
        """
        Drop root privileges for the duration of this context manager
        """
        if self.dropped:
            yield
        else:
            self.drop()
            try:
                yield
            finally:
                self.regain()


privs = ProcessPrivs()
privs.drop()
As soon as this module is loaded, root privileges are dropped, and can be regained for as short a time as possible using a handy context manager:
with privs.root():
    subprocess.run(["systemd-run", ...], check=True, capture_output=True)
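Since the helper raises RuntimeError at import time when $SUDO_UID is missing, one may want to detect the sudo environment beforehand, for example to skip tests instead of failing. This is only a sketch of that idea: sudo_user_ids and requires_sudo are names made up for this example, and they are not part of the helper above.

```python
import os
import unittest
from typing import Mapping, Optional, Tuple


def sudo_user_ids(environ: Mapping[str, str] = os.environ) -> Optional[Tuple[int, int]]:
    """
    Return the (uid, gid) of the user who invoked sudo, or None if the
    sudo variables are missing or are not valid integers
    """
    try:
        return int(environ["SUDO_UID"]), int(environ["SUDO_GID"])
    except (KeyError, ValueError):
        return None


# Hypothetical decorator for tests or test classes that need containers:
# they get skipped unless the test run was started with sudo
requires_sudo = unittest.skipUnless(
    os.geteuid() == 0 and sudo_user_ids() is not None,
    "test needs to be run under sudo")

print(sudo_user_ids({"SUDO_UID": "1000", "SUDO_GID": "1000"}))
```

Note that the check on the effective uid has to happen before privileges are dropped, otherwise it will never see root.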
Using the chroot from test cases
The infrastructure to set up and tear down ephemeral machines is relatively simple, once one has worked out the nspawn incantations:
import atexit
import logging
import os
import shlex
import subprocess
import uuid
from typing import Dict, Optional

# Assumed: a standard module-level logger.
# privs is the ProcessPrivs instance defined above
log = logging.getLogger(__name__)


class Chroot:
    """
    Manage an ephemeral chroot
    """
    running_chroots: Dict[str, "Chroot"] = {}

    def __init__(self, name: str, chroot_dir: Optional[str] = None):
        self.name = name
        if chroot_dir is None:
            self.chroot_dir = self.get_chroot_dir(name)
        else:
            self.chroot_dir = chroot_dir
        self.machine_name = f"transilience-{uuid.uuid4()}"

    def start(self):
        """
        Start nspawn on this given chroot.

        The systemd-nspawn command is run contained into its own unit using
        systemd-run
        """
        unit_config = [
            'KillMode=mixed',
            'Type=notify',
            'RestartForceExitStatus=133',
            'SuccessExitStatus=133',
            'Slice=machine.slice',
            'Delegate=yes',
            'TasksMax=16384',
            'WatchdogSec=3min',
        ]

        cmd = ["systemd-run"]
        for c in unit_config:
            cmd.append(f"--property={c}")

        cmd.extend((
            "systemd-nspawn",
            "--quiet",
            "--ephemeral",
            f"--directory={self.chroot_dir}",
            f"--machine={self.machine_name}",
            "--boot",
            "--notify-ready=yes"))

        log.info("%s: starting machine using image %s", self.machine_name, self.chroot_dir)

        log.debug("%s: running %s", self.machine_name, " ".join(shlex.quote(c) for c in cmd))
        with privs.root():
            subprocess.run(cmd, check=True, capture_output=True)
        log.debug("%s: started", self.machine_name)
        self.running_chroots[self.machine_name] = self

    def stop(self):
        """
        Stop the running ephemeral containers
        """
        cmd = ["machinectl", "terminate", self.machine_name]
        log.debug("%s: running %s", self.machine_name, " ".join(shlex.quote(c) for c in cmd))
        with privs.root():
            subprocess.run(cmd, check=True, capture_output=True)
        log.debug("%s: stopped", self.machine_name)
        del self.running_chroots[self.machine_name]

    @classmethod
    def create(cls, chroot_name: str) -> "Chroot":
        """
        Start an ephemeral machine from the given master chroot
        """
        res = cls(chroot_name)
        res.start()
        return res

    @classmethod
    def get_chroot_dir(cls, chroot_name: str):
        """
        Locate a master chroot under test_chroots/
        """
        chroot_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "test_chroots", chroot_name))
        if not os.path.isdir(chroot_dir):
            raise RuntimeError(f"{chroot_dir} does not exist or is not a chroot directory")
        return chroot_dir


# We need to use atexit, because unittest won't run
# tearDown/tearDownClass/tearDownModule methods in case of KeyboardInterrupt
# and we need to make sure to terminate the nspawn containers at exit
@atexit.register
def cleanup():
    # Use a list to prevent changing running_chroots during iteration
    for chroot in list(Chroot.running_chroots.values()):
        chroot.stop()
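For a single test that wants its own machine, the start/stop pair can also be wrapped in a context manager, so that stop() is called even if the test body raises. This is a sketch that complements the atexit hook rather than replacing it. The running() helper is made up for this example, and FakeMachine is a stand-in that only records calls, so that the example can run without root or systemd.

```python
import contextlib


@contextlib.contextmanager
def running(machine):
    """
    Start machine, and stop it when the block exits, even on exceptions
    """
    machine.start()
    try:
        yield machine
    finally:
        machine.stop()


class FakeMachine:
    """
    Stand-in for Chroot that records calls instead of running nspawn
    """
    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")


machine = FakeMachine()
try:
    with running(machine):
        # Simulate the user hitting ^C in the middle of a test
        raise KeyboardInterrupt()
except KeyboardInterrupt:
    pass
print(machine.events)
```

With the real class, the same helper would be used as with running(Chroot("buster")) as chroot, since Chroot exposes the same start() and stop() methods.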
And here's a TestCase mixin that starts a containerized system and opens a Mitogen connection to it:
class ChrootTestMixin:
    """
    Mixin to run tests over a setns connection to an ephemeral systemd-nspawn
    container running one of the test chroots
    """
    chroot_name = "buster"

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        import mitogen
        from transilience.system import Mitogen
        cls.broker = mitogen.master.Broker()
        cls.router = mitogen.master.Router(cls.broker)
        cls.chroot = Chroot.create(cls.chroot_name)
        with privs.root():
            cls.system = Mitogen(
                    cls.chroot.name, "setns", kind="machinectl",
                    python_path="/usr/bin/python3",
                    container=cls.chroot.machine_name, router=cls.router)

    @classmethod
    def tearDownClass(cls):
        super().tearDownClass()
        cls.system.close()
        cls.broker.shutdown()
        cls.chroot.stop()
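The mixin relies on cooperative super() calls, so it has to be listed before unittest.TestCase in the test class bases. unittest.TestCase.setUpClass does not call super(), so with the opposite order the mixin's setUpClass would never run. Here is a small sketch of the pattern, with a made-up FakeResourceMixin standing in for ChrootTestMixin so that it can run without containers:

```python
import io
import unittest


class FakeResourceMixin:
    """
    Stand-in for ChrootTestMixin: records class-level setup and teardown
    """
    events = []

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        cls.events.append("setUpClass")
        cls.system = "fake-system"

    @classmethod
    def tearDownClass(cls):
        super().tearDownClass()
        cls.events.append("tearDownClass")


# The mixin comes first, so that its setUpClass/tearDownClass are found
# before the ones in unittest.TestCase
class TestExample(FakeResourceMixin, unittest.TestCase):
    def test_system_available(self):
        # The resource created in setUpClass is shared by all test methods
        self.assertEqual(self.system, "fake-system")


def run_example():
    """
    Run TestExample and return the unittest result object
    """
    FakeResourceMixin.events.clear()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestExample)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


result = run_example()
print(result.wasSuccessful(), FakeResourceMixin.events)
```

A real test would inherit from ChrootTestMixin in the same position, and use self.system to run actions inside the container.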
Running tests
Once the tests are set up, everything goes on as normal, except one needs to run nose2 with sudo:
sudo nose2-3
Spin-up time for containers is pretty fast, and the tests drop root as soon as possible, regaining it only for as long as needed.
Also, dependencies for all this are minimal and available on most systems, and the setup instructions seem pretty straightforward.