@bugness-chl

Hello,

This pull request is a work in progress. Feel free to comment, and to take it over or discard it after a week without activity.

It adds some more systemd exceptions to the lxc.generator script for the nullmailer program on Debian 13 Trixie.

I think I tested with a pretty standard configuration, in both privileged and unprivileged containers. The one exception is lxc.apparmor.profile: the generated setting fails with unprivileged containers (even with /usr/sbin/apparmor_parser in the $PATH), so I used the following instead:

#lxc.apparmor.profile = generated
lxc.apparmor.profile = lxc-container-default-cgns

I am still studying the reason for the if is_lxc_privileged_container... at lines 102-104, because unprivileged containers also need those settings. I am inclined to replace it with an if true; then (hoping it doesn't decrease security too much...)
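
For context, one common way for a script to tell a privileged container from an unprivileged one is to look at the user-namespace mapping in /proc/self/uid_map: without a user namespace, the file holds a single identity mapping that covers the full 32-bit range. The sketch below assumes that heuristic. The function name has_identity_uid_map is made up for illustration, and nothing here is claimed to match what is_lxc_privileged_container actually does in the generator.

```shell
#!/bin/sh
# Hypothetical helper (not taken from lxc.generator).
# Returns 0 when the uid map is the full identity mapping "0 0 4294967295",
# which is what a process sees when it is NOT inside a user namespace.
# Returns 1 for any other mapping and 2 when the file cannot be read.
has_identity_uid_map() {
    map_file="${1:-/proc/self/uid_map}"
    [ -r "${map_file}" ] || return 2
    # Fields: <first id inside> <first id outside> <length of the range>
    awk '$1 == 0 && $2 == 0 && $3 == 4294967295 { found = 1 }
         END { exit !found }' "${map_file}"
}

# Possible use:
#   if has_identity_uid_map; then echo "no user namespace"; fi
```

With the idmap used later in this thread (0 100000 65536), the function returns 1, so a guard built on it would skip the extra settings in an unprivileged container.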

And, for reference, here is the (heavily sandboxed) nullmailer's systemd service file on Debian 13:

[Unit]
Description=Nullmailer relay-only MTA
After=network.target
RequiresMountsFor=/var/spool/nullmailer
ConditionPathExists=/var/spool/nullmailer/queue
Documentation=man:nullmailer(7)

[Service]
WorkingDirectory=/var/spool/nullmailer
ExecStart=/usr/sbin/nullmailer-send
User=mail
Group=mail
Restart=always
SyslogFacility=mail

# Sandboxing
CapabilityBoundingSet=
MemoryDenyWriteExecute=yes
NoNewPrivileges=yes
PrivateDevices=yes
PrivateMounts=yes
PrivateTmp=yes
PrivateUsers=yes
ProtectClock=yes
ProtectControlGroups=yes
ProtectHome=yes
ProtectHostname=yes
ProtectKernelLogs=yes
ProtectKernelModules=yes
ProtectKernelTunables=yes
ProtectProc=invisible
ProtectSystem=strict
ReadWriteDirectories=-/var/log
ReadWriteDirectories=-/var/run
ReadWriteDirectories=-/var/spool/nullmailer
RestrictNamespaces=yes
RestrictRealtime=yes
RestrictSUIDSGID=yes

[Install]
WantedBy=multi-user.target

Those exceptions are for nullmailer on Debian 13.

Signed-off-by: Ch. Larose <[email protected]>
@stgraber
Member

I think there's possibly a difference in handling between Incus and LXC.

I suspect the dynamic AppArmor profile generated for unprivileged containers on Incus is more permissive and doesn't need those workarounds, unless the container is running in privileged mode and therefore has a stricter AppArmor profile.

I like keeping the workarounds to a minimum, so if it's an issue with LXC specifically, we should tweak the conditions in the script to detect that specifically (possibly by looking at /proc/self/attr/current) and only apply the extra rules in that case.
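
As a rough illustration of that idea, the sketch below reads the AppArmor label of the calling process from /proc/self/attr/current. It assumes the usual "name (mode)" layout of that file, or the single word unconfined. The function name current_apparmor_profile is invented here and is not part of the generator.

```shell
#!/bin/sh
# Hypothetical helper: print the AppArmor profile name of the calling process.
# Assumes the file holds "profile-name (mode)" or just "unconfined".
current_apparmor_profile() {
    attr_file="${1:-/proc/self/attr/current}"
    [ -r "${attr_file}" ] || return 1
    line=""
    # Accept a last line without trailing newline, reject an empty file.
    IFS= read -r line < "${attr_file}" || [ -n "${line}" ] || return 1
    # Strip the trailing " (enforce)" / " (complain)" part, if any.
    printf '%s\n' "${line% (*}"
}
```

A caller could then match the result against profile names such as lxc-container-default-cgns before emitting the extra rules. Which names should trigger them is left open here.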

@bugness-chl
Author

Sorry for the delay, I'll close the pull-request.

I'm afraid I lack the time to investigate the interactions between lxc, apparmor (and its generated profile) and systemd.

In case it helps someone struggling with a similar case, I'll add some of my findings here.

First, here is my setup on Debian 13 Trixie (amd64), with uid 1000 and the range 100000:65536 in /etc/subuid and /etc/subgid:

# Install lxc and give ourselves some quota for network interfaces
sudo apt install lxc
printf "%s\tveth\tlxcbr0\t10\n" "$USER" | sudo tee -a /etc/lxc/lxc-usernet

# uid 100000 needs access to ~/.local/share/lxc...
chmod o+x ~/
mkdir -p ~/.config/lxc
cat <<EOF >> ~/.config/lxc/default.conf
lxc.include = /etc/lxc/default.conf
lxc.idmap = u 0 100000 65536
lxc.idmap = g 0 100000 65536
EOF
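
The idmap lines above have to match the range actually delegated in /etc/subuid and /etc/subgid, which use the name:start:count format. As a minimal sketch (the function name subid_range is invented for illustration), the range for a user can be read from such a file and turned into the matching lxc.idmap line:

```shell
#!/bin/sh
# Hypothetical helper: print "<start> <count>" for the first entry of USER
# in a subuid/subgid style FILE. Returns non-zero when no entry exists.
subid_range() {
    user="$1"
    file="$2"
    awk -F: -v u="${user}" '$1 == u { print $2, $3; found = 1; exit }
        END { exit !found }' "${file}"
}

# Possible use:
#   range=$(subid_range "$(id -un)" /etc/subuid) &&
#       printf 'lxc.idmap = u 0 %s\n' "${range}"
```

Only the first matching entry is used; a user with several delegated ranges would need more care.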

Next, we create and launch a downloaded Debian Trixie container:

lxc-create -t download -n test1 -- -d debian -r trixie -a amd64
# default 'generated' profile doesn't work.
echo "lxc.apparmor.profile = lxc-container-default-cgns" >> ~/.local/share/lxc/test1/config

lxc-unpriv-start -F -n test1
#Running as unit: run-p6959-i7259.scope; invocation ID: 905f755675474b4bbdcefe8429e40bee
#systemd 257.8-1~deb13u2 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA +IPE +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBCRYPTSETUP_PLUGINS +LIBFDISK +PCRE2 +PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +BPF_FRAMEWORK +BTF -XKBCOMMON -UTMP +SYSVINIT +LIBARCHIVE)
#Detected virtualization lxc.
#Detected architecture x86-64.
#Detected first boot.
#...

This outputs a console log with several errors in it, mainly related to systemd namespacing. Here are a few of my hacks:

sudo patch -d $HOME/.local/share/lxc/test1/rootfs/etc/systemd/system-generators -p0 <<'EOF'
--- lxc
+++ lxc
@@ -91,15 +91,17 @@ fix_systemd_override_unit() {
 		echo "[Service]";
 		[ "${SYSTEMD}" -ge 247 ] && echo "ProcSubset=all";
 		[ "${SYSTEMD}" -ge 247 ] && echo "ProtectProc=default";
+		[ "${SYSTEMD}" -ge 232 ] && echo "PrivateUsers=no";
 		[ "${SYSTEMD}" -ge 232 ] && echo "ProtectControlGroups=no";
 		[ "${SYSTEMD}" -ge 232 ] && echo "ProtectKernelTunables=no";
 		[ "${SYSTEMD}" -ge 239 ] && echo "NoNewPrivileges=no";
+		[ "${SYSTEMD}" -ge 239 ] && echo "PrivateMounts=no";
 		[ "${SYSTEMD}" -ge 249 ] && echo "LoadCredential=";
 		[ "${SYSTEMD}" -ge 254 ] && echo "PrivateNetwork=no";
 		[ "${SYSTEMD}" -ge 256 ] && echo "ImportCredential=";
 
 		# Additional settings for privileged containers
-		if is_lxc_privileged_container; then
+		#if is_lxc_privileged_container; then
 			echo "ProtectHome=no";
 			echo "ProtectSystem=no";
 			echo "PrivateDevices=no";
@@ -108,7 +110,7 @@
 			[ "${SYSTEMD}" -ge 232 ] && echo "ProtectKernelModules=no";
 			[ "${SYSTEMD}" -ge 231 ] && echo "ReadWritePaths=";
 			[ "${SYSTEMD}" -ge 254 ] && [ "${SYSTEMD}" -lt 256 ] && echo "ImportCredential=";
-		fi
+		#fi
 
 		true;
 	} > "${dropin_dir}/zzz-lxc-service.conf"
EOF

cat <<EOF >> ~/.local/share/lxc/test1/config
lxc.mount.entry = mqueue dev/mqueue mqueue nosuid,noexec,nodev,create=dir 0 0
EOF
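
A narrower alternative to patching the generator would be a drop-in for the one affected unit. The sketch below is untested and rests on assumptions: that the two settings added by the patch above (PrivateUsers=no and PrivateMounts=no) are the relevant ones for nullmailer, which this thread does not establish, and that a drop-in named lxc-sandbox.conf (a name made up here) is acceptable. It writes the file into the container rootfs given as argument and needs write access to that tree.

```shell
#!/bin/sh
# Hypothetical helper: write a per-unit drop-in for nullmailer.service
# below the given container rootfs.
write_nullmailer_dropin() {
    rootfs="$1"
    dropin_dir="${rootfs}/etc/systemd/system/nullmailer.service.d"
    mkdir -p "${dropin_dir}" || return 1
    # Same two overrides as the lines added to the generator patch.
    cat > "${dropin_dir}/lxc-sandbox.conf" <<'EOF'
[Service]
PrivateUsers=no
PrivateMounts=no
EOF
}

# Possible use:
#   sudo sh -c '. ./dropin.sh; write_nullmailer_dropin /path/to/rootfs'
```

Unlike the generator patch, this leaves the sandboxing of every other unit untouched.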

And some more radical ones :)

sed -i 's/^lxc.apparmor.profile =.*/lxc.apparmor.profile = unconfined/' ~/.local/share/lxc/test1/config
lxc-stop -n test1
lxc-unpriv-start -n test1
lxc-unpriv-attach -n test1 <<'EOF'
apt install -y ifupdown
printf "auto eth0\niface eth0 inet dhcp\n" >> /etc/network/interfaces
apt purge -y systemd-resolved
systemctl disable systemd-networkd
systemctl mask tmp.mount
systemctl mask sys-kernel-config.mount sys-kernel-debug.mount
EOF
lxc-stop -n test1
sed -i 's/^lxc.apparmor.profile =.*/lxc.apparmor.profile = lxc-container-default-cgns/' ~/.local/share/lxc/test1/config
lxc-unpriv-start -F -n test1
# (nearly) no more error messages in the boot log \o/
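
The commands in this thread set lxc.apparmor.profile in two ways: by appending with echo and by rewriting with sed. The first adds a second line if run twice, and the second does nothing when the key is absent. A small sketch that handles both cases follows. The function name set_lxc_key is made up, it relies on GNU sed -i, and it assumes the value contains no |, & or backslash characters.

```shell
#!/bin/sh
# Hypothetical helper: set KEY to VALUE in an LXC config file,
# replacing existing lines for that key or appending a new one.
set_lxc_key() {
    config="$1"
    key="$2"
    value="$3"
    # Escape the dots in the key so they match literally in the regex.
    key_re=$(printf '%s' "${key}" | sed 's/\./\\./g')
    if grep -q "^${key_re}[[:space:]]*=" "${config}"; then
        sed -i "s|^${key_re}[[:space:]]*=.*|${key} = ${value}|" "${config}"
    else
        printf '%s = %s\n' "${key}" "${value}" >> "${config}"
    fi
}

# Possible use:
#   set_lxc_key ~/.local/share/lxc/test1/config lxc.apparmor.profile unconfined
```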

Another way is to debootstrap instead of downloading (part of the difference is that debootstrap does not install systemd-networkd and systemd-resolved):

# if 'lxc-templates' is not installed, you get the error: ...get_template_path: 1010 No such file or directory - bad template: debian
sudo apt install lxc-templates

sudo lxc-create -n template-debian-13-trixie -P $HOME/.local/share/lxc -t debian -- -r trixie
sudo mkdir $HOME/.local/share/lxc/template-debian-13-trixie/rootfs/etc/systemd/system-generators/
sudo wget -O $HOME/.local/share/lxc/template-debian-13-trixie/rootfs/etc/systemd/system-generators/lxc "https://sources.debian.org/data/main/d/distrobuilder/3.2-2/distrobuilder/lxc.generator"
sudo chmod +x $HOME/.local/share/lxc/template-debian-13-trixie/rootfs/etc/systemd/system-generators/lxc
sudo patch -d $HOME/.local/share/lxc/template-debian-13-trixie/rootfs/etc/systemd/system-generators -p0 <<'EOF'
--- lxc
+++ lxc
@@ -91,15 +91,17 @@ fix_systemd_override_unit() {
 		echo "[Service]";
 		[ "${SYSTEMD}" -ge 247 ] && echo "ProcSubset=all";
 		[ "${SYSTEMD}" -ge 247 ] && echo "ProtectProc=default";
+		[ "${SYSTEMD}" -ge 232 ] && echo "PrivateUsers=no";
 		[ "${SYSTEMD}" -ge 232 ] && echo "ProtectControlGroups=no";
 		[ "${SYSTEMD}" -ge 232 ] && echo "ProtectKernelTunables=no";
 		[ "${SYSTEMD}" -ge 239 ] && echo "NoNewPrivileges=no";
+		[ "${SYSTEMD}" -ge 239 ] && echo "PrivateMounts=no";
 		[ "${SYSTEMD}" -ge 249 ] && echo "LoadCredential=";
 		[ "${SYSTEMD}" -ge 254 ] && echo "PrivateNetwork=no";
 		[ "${SYSTEMD}" -ge 256 ] && echo "ImportCredential=";
 
 		# Additional settings for privileged containers
-		if is_lxc_privileged_container; then
+		#if is_lxc_privileged_container; then
 			echo "ProtectHome=no";
 			echo "ProtectSystem=no";
 			echo "PrivateDevices=no";
@@ -108,7 +110,7 @@
 			[ "${SYSTEMD}" -ge 232 ] && echo "ProtectKernelModules=no";
 			[ "${SYSTEMD}" -ge 231 ] && echo "ReadWritePaths=";
 			[ "${SYSTEMD}" -ge 254 ] && [ "${SYSTEMD}" -lt 256 ] && echo "ImportCredential=";
-		fi
+		#fi
 
 		true;
 	} > "${dropin_dir}/zzz-lxc-service.conf"
EOF

cat >/tmp/shiftuid.sh <<-'EOF'
#!/bin/bash
SUBUID="$2"
SUBGID="$3"
find "$1" -print0 | while IFS= read -r -d '' i; do
    CURRENT_UID=$(stat --format=%u "$i")
    CURRENT_GID=$(stat --format=%g "$i")
    if [ "$CURRENT_UID" -lt "$SUBUID" ] && [ "$CURRENT_GID" -lt "$SUBGID" ]; then
        chown -h "$((CURRENT_UID + SUBUID)):$((CURRENT_GID + SUBGID))" "$i"
    else
        echo "uid or gid already changed for $i ?"
    fi
done
EOF
sudo bash /tmp/shiftuid.sh $HOME/.local/share/lxc/template-debian-13-trixie/rootfs 100000 100000
rm /tmp/shiftuid.sh
sudo chown $USER:100000 $HOME/.local/share/lxc/template-debian-13-trixie
sudo chown $USER:$USER  $HOME/.local/share/lxc/template-debian-13-trixie/config
sudo chmod 0710 $HOME/.local/share/lxc/template-debian-13-trixie
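
The arithmetic behind the shift is the same as in an lxc.idmap line of the form 0 base count: container id N maps to host id base + N, as long as N is below count. A small sketch of it (the function name map_id is invented for illustration):

```shell
#!/bin/sh
# Hypothetical helper: translate a container uid/gid to the host id
# for a mapping "0 <base> <count>", e.g. "lxc.idmap = u 0 100000 65536".
map_id() {
    id="$1"
    base="$2"
    count="$3"
    # Ids outside the mapped range have no host equivalent.
    [ "${id}" -ge 0 ] && [ "${id}" -lt "${count}" ] || return 1
    echo "$((base + id))"
}

# Possible use:
#   map_id 1000 100000 65536   # prints 101000
```

This is why the template's rootfs ends up owned by ids from 100000 upwards on the host while the container still sees them starting at 0.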

sed -i 's/^lxc.apparmor.profile =.*/lxc.apparmor.profile = lxc-container-default-cgns/' ~/.local/share/lxc/template-debian-13-trixie/config
cat <<EOF >> ~/.local/share/lxc/template-debian-13-trixie/config
lxc.idmap = u 0 100000 65536
lxc.idmap = g 0 100000 65536
lxc.mount.entry = mqueue dev/mqueue mqueue nosuid,noexec,nodev,create=dir 0 0
EOF

# One can then enjoy the template via a simple:
lxc-copy -n template-debian-13-trixie -N project1234-dev

Another variant to test may be Devuan 6, since it was recently published:

lxc-create -t download -n test3 -- -d devuan -r excalibur -a amd64

# default 'generated' profile doesn't work.
echo "lxc.apparmor.profile = lxc-container-default-cgns" >> ~/.local/share/lxc/test3/config

...if we can ignore the mount/tmpfs and sysctl errors.

The debootstrap method currently fails with Devuan 6 Excalibur due to some weird ssh/systemd error:

# doesn't work ?
#test -f /usr/share/debootstrap/scripts/excalibur || sudo wget -O /usr/share/debootstrap/scripts/excalibur https://git.devuan.org/devuan/debootstrap/src/commit/b63f6a74bec5e9c5210314c462be993284c21111/scripts/ceres
# doesn't work either, it fails on ssh
cd /usr/share/debootstrap/scripts && sudo ln -s ceres excalibur
sudo lxc-create -t devuan -n devuan-via-debootstrap-2 -P $HOME/.local/share/lxc -- -r excalibur

@bugness-chl bugness-chl closed this Dec 5, 2025