Ubuntu 24.x "permission denied" in mount(/, /) #236

Closed
mattgodbolt opened this issue Aug 3, 2024 · 5 comments

@mattgodbolt

We have nsjail working on Ubuntu 20.x with cgroups v2 (despite initially hitting issues around #196), but on an upgraded machine now running 24.x we see this (tail of a log):

[I][2024-08-03T13:09:49-0500] Uid map: inside_uid:10240 outside_uid:1000 count:1 newuidmap:false
[I][2024-08-03T13:09:49-0500] Gid map: inside_gid:10240 outside_gid:1000 count:1 newgidmap:false
[D][2024-08-03T13:09:49-0500][2054703] setSigHandler():68 Setting sighandler for signal SIGINT (2)
[D][2024-08-03T13:09:49-0500][2054703] setSigHandler():68 Setting sighandler for signal SIGQUIT (3)
[D][2024-08-03T13:09:49-0500][2054703] setSigHandler():68 Setting sighandler for signal SIGUSR1 (10)
[D][2024-08-03T13:09:49-0500][2054703] setSigHandler():68 Setting sighandler for signal SIGALRM (14)
[D][2024-08-03T13:09:49-0500][2054703] setSigHandler():68 Setting sighandler for signal SIGCHLD (17)
[D][2024-08-03T13:09:49-0500][2054703] setSigHandler():68 Setting sighandler for signal SIGTERM (15)
[D][2024-08-03T13:09:49-0500][2054703] setSigHandler():68 Setting sighandler for signal SIGTTIN (21)
[D][2024-08-03T13:09:49-0500][2054703] setSigHandler():68 Setting sighandler for signal SIGTTOU (22)
[D][2024-08-03T13:09:49-0500][2054703] setSigHandler():68 Setting sighandler for signal SIGPIPE (13)
[I][2024-08-03T13:09:49-0500] Detected cgroups version: 2
[D][2024-08-03T13:09:49-0500][2054703] runChild():471 Creating new process with clone flags:CLONE_NEWNS|CLONE_NEWCGROUP|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWUSER|CLONE_NEWPID|CLONE_NEWNET and exit_signal:SIGCHLD
[D][2024-08-03T13:09:49-0500][2054703] addProc():251 Added pid=2054704 with start time 1722708589 to the queue for IP: '[STANDALONE MODE]'
[D][2024-08-03T13:09:49-0500][2054703] createCgroup():64 Create '/sys/fs/cgroup/ce-compile/NSJAIL.2054704' for pid=2054704
[D][2024-08-03T13:09:49-0500][2054703] addPidToProcList():47 Adding pid='2054704' to cgroup.procs
[D][2024-08-03T13:09:49-0500][2054703] writeBufToFile():115 Written '7' bytes to '/sys/fs/cgroup/ce-compile/NSJAIL.2054704/cgroup.procs'
[I][2024-08-03T13:09:49-0500] Setting 'memory.max' to '1342177280'
[D][2024-08-03T13:09:49-0500][2054703] writeBufToFile():115 Written '10' bytes to '/sys/fs/cgroup/ce-compile/NSJAIL.2054704/memory.max'
[D][2024-08-03T13:09:49-0500][2054703] createCgroup():64 Create '/sys/fs/cgroup/ce-compile/NSJAIL.2054704' for pid=2054704
[D][2024-08-03T13:09:49-0500][2054703] addPidToProcList():47 Adding pid='2054704' to cgroup.procs
[D][2024-08-03T13:09:49-0500][2054703] writeBufToFile():115 Written '7' bytes to '/sys/fs/cgroup/ce-compile/NSJAIL.2054704/cgroup.procs'
[I][2024-08-03T13:09:49-0500] Setting 'pids.max' to '72'
[D][2024-08-03T13:09:49-0500][2054703] writeBufToFile():115 Written '2' bytes to '/sys/fs/cgroup/ce-compile/NSJAIL.2054704/pids.max'
[D][2024-08-03T13:09:49-0500][2054703] createCgroup():64 Create '/sys/fs/cgroup/ce-compile/NSJAIL.2054704' for pid=2054704
[D][2024-08-03T13:09:49-0500][2054703] addPidToProcList():47 Adding pid='2054704' to cgroup.procs
[D][2024-08-03T13:09:49-0500][2054703] writeBufToFile():115 Written '7' bytes to '/sys/fs/cgroup/ce-compile/NSJAIL.2054704/cgroup.procs'
[I][2024-08-03T13:09:49-0500] Setting 'cpu.max' to '1000000 1000000'
[D][2024-08-03T13:09:49-0500][2054703] writeBufToFile():115 Written '15' bytes to '/sys/fs/cgroup/ce-compile/NSJAIL.2054704/cpu.max'
[D][2024-08-03T13:09:49-0500][2054703] writeBufToFile():115 Written '4' bytes to '/proc/2054704/setgroups'
[D][2024-08-03T13:09:49-0500][2054703] gidMapSelf():171 Writing '10240 1000 1
' to '/proc/2054704/gid_map'
[D][2024-08-03T13:09:49-0500][2054703] writeBufToFile():115 Written '13' bytes to '/proc/2054704/gid_map'
[D][2024-08-03T13:09:49-0500][2054703] uidMapSelf():143 Writing '10240 1000 1
' to '/proc/2054704/uid_map'
[D][2024-08-03T13:09:49-0500][2054703] writeBufToFile():115 Written '13' bytes to '/proc/2054704/uid_map'
[D][2024-08-03T13:09:49-0500][1] setResGid():65 setresgid(10240)
[D][2024-08-03T13:09:49-0500][1] initNsFromChild():289 setgroups(0, [])
[D][2024-08-03T13:09:49-0500][1] initNsFromChild():296 setgroups(0, []) failed: Operation not permitted
[D][2024-08-03T13:09:49-0500][1] setResUid():81 setresuid(10240)
[D][2024-08-03T13:09:49-0500][1] mkdirAndTest():296 Created accessible directory in '/run/user/1000/nsjail'
[D][2024-08-03T13:09:49-0500][1] mkdirAndTest():296 Created accessible directory in '/run/user/1000/nsjail/root'
[E][2024-08-03T13:09:49-0500][1] initCloneNs():391 mount('/', '/', NULL, MS_REC|MS_PRIVATE, NULL): Permission denied
[F][2024-08-03T13:09:49-0500][1] runChild():487 Launching child process failed
[W][2024-08-03T13:09:49-0500][2054703] runChild():507 Received error message from the child process before it has been executed
[E][2024-08-03T13:09:49-0500][2054703] standaloneMode():275 Couldn't launch the child process
[D][2024-08-03T13:09:49-0500][2054703] main():376 Returning with 255

Seemingly it can't mount the root directory: the mount('/', '/', NULL, MS_REC|MS_PRIVATE, NULL) call fails with "Permission denied", which is surprising. The command is:

nsjail --verbose --config etc/nsjail/compilers-and-tools.cfg -- /bin/bash

and the referenced cfg file is https://github.com/compiler-explorer/compiler-explorer/blob/main/etc/nsjail/compilers-and-tools.cfg (with the log_level set to DEBUG).

Additionally, these commands were run beforehand to get the cgroups to work:

sudo cgcreate -a $USER:$USER -g memory,pids,cpu:ce-compile
sudo chown $USER:root /sys/fs/cgroup/cgroup.procs

@mattgodbolt
Author

I just thought to check dmesg, and it shows AppArmor denying both the setgid capability and the mount:

[266713.881047] audit: type=1400 audit(1722708589.045:468): apparmor="AUDIT" operation="userns_create" class="namespace" info="Userns create - transitioning profile" profile="unconfined" pid=2054703 comm="nsjail" requested="userns_create" target="unprivileged_userns"
[266713.893050] audit: type=1400 audit(1722708589.057:469): apparmor="DENIED" operation="capable" class="cap" profile="unprivileged_userns" pid=2054704 comm="nsjail" capability=6  capname="setgid"
[266713.893126] audit: type=1400 audit(1722708589.057:470): apparmor="DENIED" operation="mount" class="mount" info="failed mntpnt match" error=-13 profile="unprivileged_userns" name="/" pid=2054704 comm="nsjail" flags="rw, rprivate"
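
To pull just these denials out of a noisy kernel log, a small filter along the following lines may help. This is a minimal sketch: the function name apparmor_denials is made up for illustration, and it only assumes the audit record format shown in the lines above.

```shell
# Print only AppArmor DENIED audit records for a given command name.
# Reads kernel log text on stdin; the command name defaults to "nsjail".
# (apparmor_denials is a hypothetical helper, not part of nsjail or AppArmor.)
apparmor_denials() {
    comm="${1:-nsjail}"
    grep -F 'apparmor="DENIED"' | grep -F "comm=\"${comm}\""
}

# Typical use (reading the kernel log may need root):
#   sudo dmesg | apparmor_denials nsjail
```

If the filter prints records with profile="unprivileged_userns", the failure is most likely the AppArmor user namespace restriction rather than anything in the nsjail config.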

@mattgodbolt
Author

Looks like this restriction on unprivileged user namespaces came in 23.10: https://ubuntu.com/blog/ubuntu-23-10-restricted-unprivileged-user-namespaces; now trying to work out how to disable it. Leaving this issue open in case folks who have more experience have advice.

@mattgodbolt
Author

mattgodbolt commented Aug 3, 2024

Per the above link, this is a workaround:

sudo sysctl -w kernel.apparmor_restrict_unprivileged_unconfined=0
sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0
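
Settings made with sysctl -w do not survive a reboot. One way to keep them might be a sysctl.d fragment, sketched below. The helper name userns_sysctl_conf and the file name 60-nsjail-userns.conf are arbitrary choices for illustration, and note that this relaxes the restriction for every process on the machine, not just nsjail.

```shell
# Emit a sysctl.d fragment that keeps the two settings from the workaround
# across reboots. (userns_sysctl_conf is a hypothetical helper name.)
userns_sysctl_conf() {
    cat <<'EOF'
# Relax Ubuntu's AppArmor restriction on unprivileged user namespaces.
kernel.apparmor_restrict_unprivileged_unconfined = 0
kernel.apparmor_restrict_unprivileged_userns = 0
EOF
}

# Install and apply (needs root; the file name is an assumption):
#   userns_sysctl_conf | sudo tee /etc/sysctl.d/60-nsjail-userns.conf
#   sudo sysctl --system
```
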

But the general advice is to create an AppArmor profile. Perhaps this is something nsjail can do?

@disconnect3d
Contributor

Unfortunately nsjail does not support AppArmor profiles at this moment (I believe they would be happy to do so). If you are running things via Docker (I guess you are not, but still maybe worth documenting it here) you can use --security-opt apparmor=unconfined.

I also believe there should be some way to disable AppArmor just for a single process. An alternative is to create an empty profile for it as well.
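
As a rough sketch of what such a near-empty profile could look like, the helper below prints a profile that leaves nsjail unconfined but explicitly allows it to create user namespaces, following the pattern Ubuntu describes for 23.10 and later. Everything here is an assumption rather than something tested in this thread: the helper name nsjail_apparmor_profile, the default binary path /usr/local/bin/nsjail, the abi/4.0 line (which depends on the installed AppArmor version), and the target file /etc/apparmor.d/nsjail.

```shell
# Print an AppArmor profile that keeps nsjail unconfined but permits userns
# creation. The binary path is an assumption; pass the real one as $1.
# (nsjail_apparmor_profile is a hypothetical helper name.)
nsjail_apparmor_profile() {
    nsjail_bin="${1:-/usr/local/bin/nsjail}"
    cat <<EOF
abi <abi/4.0>,
include <tunables/global>

profile nsjail ${nsjail_bin} flags=(unconfined) {
  userns,
}
EOF
}

# Install and load (needs root; the target file name is an assumption):
#   nsjail_apparmor_profile "$(command -v nsjail)" | sudo tee /etc/apparmor.d/nsjail
#   sudo apparmor_parser -r /etc/apparmor.d/nsjail
```

Unlike the sysctl approach, a profile like this would only exempt the one binary, leaving the restriction in place for everything else.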

Some commands from here may be helpful: https://www.cyberciti.biz/faq/ubuntu-linux-howto-disable-apparmor-commands/

@mattgodbolt
Author

Thanks @disconnect3d. We're not running in Docker. I'm running on a vanilla install of Ubuntu 24.04 here, with only the setup commands above. An empty profile sounds OK too; thanks. Just worth knowing about this gotcha (maybe updating some docs somewhere?)

Will close now, as the sysctl workaround "works", as would disabling AppArmor entirely, and probably some kind of per-process exemption too.
