This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.
One day after the first deploy, we went to check how the system was doing, and found some fine-tuning to do, some of it rather urgent.
Inspecting
Since the system runs on a read-only rootfs with a writable tmpfs overlay, one can inspect the contents of /live/cow and see exactly which files were written since the last boot. ncdu -x /live/cow is a wonderful, wonderful thing.
In this way, we can quickly identify disk/memory usage leaks and other surprises, like an unexpectedly updated apt package database.
An unexpectedly updated apt package database, with apt sources that may publish broken software, raised very loud alarm bells.
Disable apt timers
It looks like Raspbian ships with the automatic apt update/upgrade timers enabled. In our case, that would give us a system that works when turned on, upgrades overnight, and won't play videos the next day. Rebooting resets the tmpfs overlay and makes it work again, until the next nightly upgrade, and so on.
In other words, a flaky system, that would thankfully fix itself at boot but break one day after booting. A system that would be very hard to debug. A system that would soon lose the trust of its users.
The first hotfix after deployment of Himblick was then to update the provisioning procedure to disable automatic package updates:
systemctl disable apt-daily.timer
systemctl mask apt-daily.timer
systemctl disable apt-daily-upgrade.timer
systemctl mask apt-daily-upgrade.timer
Of course, the first system to be patched was on top of a very tall ladder close to a museum ceiling.
journald disk usage
Logging takes an increasing amount of space. In theory, using a systemd.volatile setup, journald does the right thing by default. Since we need to use dracut's hack instead of systemd.volatile, we need to take manual steps to bound the amount of disk space used.
Thankfully, it looks easy to fine-tune journald's disk usage.
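As a rough sketch of what such a provisioning step could look like, the function below writes a journald drop-in that caps journal size through the SystemMaxUse and RuntimeMaxUse settings. The function name, the drop-in file name himblick.conf, and the 16M default are assumptions made for illustration, not what Himblick actually ships.

```python
from pathlib import Path


def write_journald_limits(root: Path, max_use: str = "16M") -> Path:
    """Write a journald drop-in under `root` that caps journal size.

    `root` is the mounted root filesystem of the image being provisioned.
    The drop-in name and the default limit are illustrative assumptions.
    """
    dropin_dir = root / "etc" / "systemd" / "journald.conf.d"
    dropin_dir.mkdir(parents=True, exist_ok=True)
    dropin = dropin_dir / "himblick.conf"
    dropin.write_text(
        "[Journal]\n"
        # Cap for the persistent journal in /var/log/journal
        f"SystemMaxUse={max_use}\n"
        # Cap for the volatile journal in /run/log/journal
        f"RuntimeMaxUse={max_use}\n"
    )
    return dropin
```

Setting both limits covers the journal whether it ends up in /var/log/journal or in /run/log/journal.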
Limit the growth of .xsession-errors
The .xsession-errors file grows indefinitely during the X session, and it cannot be rotated without restarting X. Deleting it won't help, as the X session still has the file open and keeps it allocated and growing on disk. At most, it can be occasionally truncated.
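Occasional truncation could be sketched as below. The helper name and the 1 MiB threshold are assumptions. It also assumes that the X session writes to the file in append mode: if it does not, the writer would keep its old file offset and the file would come back as a sparse file of the same apparent size.

```python
import os
from pathlib import Path


def truncate_if_large(path: Path, max_bytes: int = 1024 * 1024) -> bool:
    """Truncate `path` to zero length if it is bigger than `max_bytes`.

    Returns True if the file was truncated. The file is truncated in
    place, so processes that hold it open keep writing to the same file.
    """
    try:
        size = path.stat().st_size
    except FileNotFoundError:
        return False
    if size <= max_bytes:
        return False
    os.truncate(path, 0)
    return True
```

Something like this could be called periodically on the .xsession-errors file in the home directory of the user running the X session.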
The file is created by /etc/X11/Xsession before sourcing other configuration files, so one cannot override its location with, say, /dev/null, or a pipe to some command, without editing the Xsession script itself.
Still, .xsession-errors is extremely useful for finding unexpected error output from X programs when something goes wrong.
In our case, himblick-player is the only program run in the X session. We can greatly limit the growth of .xsession-errors by making himblick-player log to a file instead of stderr, using one of Python's rotating logging handlers to limit the amount of Himblick's stored logging. Alternatively, we can send Himblick's log directly to journald, and let journald take care of disk allocation.
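A minimal sketch of the rotating-file option, using logging.handlers.RotatingFileHandler from the standard library. The logger name, the log format, and the size and backup defaults are assumptions chosen for illustration.

```python
import logging
import logging.handlers


def setup_logging(logfile: str, max_bytes: int = 1_000_000,
                  backups: int = 3) -> logging.Logger:
    """Log to `logfile`, keeping at most `backups` rotated copies.

    Disk usage is bounded by roughly max_bytes * (backups + 1).
    """
    log = logging.getLogger("himblick")
    log.setLevel(logging.INFO)
    handler = logging.handlers.RotatingFileHandler(
        logfile, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    log.addHandler(handler)
    # Do not pass records on to the root logger, whose default handling
    # would write them to stderr and therefore to .xsession-errors
    log.propagate = False
    return log
```

With these defaults the log would never take much more than 4MB, whatever the uptime.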
Once that is sorted, we can change Himblick to capture the players' stdout and stderr, and log it, to avoid it going to .xsession-errors.
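Capturing a player's output could look roughly like the sketch below, which merges stderr into stdout and forwards each line to a logger. The function name is made up, and the blocking subprocess call is an assumption made to keep the example short: the real player may well manage its child processes differently.

```python
import logging
import subprocess
from typing import Sequence


def run_and_log(cmd: Sequence[str], log: logging.Logger) -> int:
    """Run `cmd`, sending each line of its stdout and stderr to `log`.

    Returns the exit status of the command.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same pipe
        text=True,
        errors="replace",  # do not fail on undecodable output
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        log.info("%s: %s", cmd[0], line.rstrip())
    return proc.wait()
```

Since the child's output ends up in a pipe instead of the inherited stderr, nothing it prints reaches .xsession-errors anymore.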