Kubernetes container pods fail with EACCES when using a custom user #3290

Open
kwohlfahrt opened this issue May 16, 2024 · 3 comments
Labels: bug Something isn't working
@kwohlfahrt

Describe the bug
I am running actions-runner-controller to host GitHub Actions workflows. When I combine this with a workflow that runs in a custom container whose user is neither root nor the runner user (UID 1001), the run fails with EACCES: permission denied, open '/__w/_temp/_runner_file_commands/set_env_8e7dea0f-bec9-4fd6-9b11-824b0bb16a6c'.

The issue persists even if I set fsGroup: 1001 on both the runner and the workflow container. This is because the runner pre-creates the output files with -rw-r--r-- permissions, so group membership alone is insufficient for writes:

$ ls -l /home/runner/_work/_temp/_runner_file_commands
total 0
-rw-r--r-- 1 runner runner 0 May 14 17:23 add_path_3da96eb5-2ed4-41a6-b402-c9f1be15a554
-rw-r--r-- 1 runner runner 0 May 14 17:23 add_path_8e7dea0f-bec9-4fd6-9b11-824b0bb16a6c
-rw-r--r-- 1 runner runner 0 May 14 17:23 save_state_3da96eb5-2ed4-41a6-b402-c9f1be15a554
-rw-r--r-- 1 runner runner 0 May 14 17:23 save_state_8e7dea0f-bec9-4fd6-9b11-824b0bb16a6c
-rw-r--r-- 1 runner runner 0 May 14 17:23 set_env_3da96eb5-2ed4-41a6-b402-c9f1be15a554
-rw-r--r-- 1 runner runner 0 May 14 17:23 set_env_8e7dea0f-bec9-4fd6-9b11-824b0bb16a6c
-rw-r--r-- 1 runner runner 0 May 14 17:23 set_output_3da96eb5-2ed4-41a6-b402-c9f1be15a554
-rw-r--r-- 1 runner runner 0 May 14 17:23 set_output_8e7dea0f-bec9-4fd6-9b11-824b0bb16a6c
-rw-r--r-- 1 runner runner 0 May 14 17:23 step_summary_3da96eb5-2ed4-41a6-b402-c9f1be15a554
-rw-r--r-- 1 runner runner 0 May 14 17:23 step_summary_8e7dea0f-bec9-4fd6-9b11-824b0bb16a6c
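The permission problem can be reproduced locally without Kubernetes: under the typical umask of 022, newly created files get mode 0644, so a process whose only claim on the file is fsGroup membership can read it but not write to it. A minimal sketch (throwaway temp paths standing in for /__w/_temp/_runner_file_commands, not the actual runner code):

```shell
# Simulate how the runner pre-creates a command file under umask 022,
# then inspect the resulting permission bits. The 644 mode has no group
# write bit, which is why fsGroup membership alone cannot help.
umask 022
dir=$(mktemp -d)
touch "$dir/set_env_example"

mode=$(stat -c '%a' "$dir/set_env_example")
echo "mode=$mode"   # 644: owner rw, group read-only

rm -r "$dir"
```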

If I set runAsUser: 1001 on the workflow container, the run gets further, but eventually (expectedly) fails because our image assumes the runtime user is the same as the user the image was built with.

To Reproduce

  1. Deploy the runner controller
  2. Deploy a runner scale-set, using the kubernetes containerMode. Configure spec.securityContext.fsGroup: 1001:
    a. On the runner, using the template property of the Helm chart
    b. On the worker, using ACTIONS_RUNNER_CONTAINER_HOOK_TEMPLATE
  3. Launch a workflow in a custom container that specifies a user that is neither root nor UID 1001

The full installation manifests are included at the end of this report.

Expected behavior

I expect the workflow container to be able to write its output if it shares the runner container's fsGroup. I think the best solution is for the runner container to add group write permissions to the output files it creates.
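The suggested change amounts to one extra chmod (or a 002 umask) when the runner creates its command files. A hedged sketch of the intended effect, again using throwaway paths rather than the real runner implementation:

```shell
# After creating a command file, add group write so that a container
# sharing the runner's fsGroup can append to it.
umask 022
dir=$(mktemp -d)
touch "$dir/set_env_example"          # created as -rw-r--r-- (644)
chmod g+w "$dir/set_env_example"      # now -rw-rw-r-- (664)

stat -c '%a' "$dir/set_env_example"   # 664: group members may now write

rm -r "$dir"
```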

Runner Version and Platform

Version of your runner? 2.315.0

OS of the machine running the runner? Linux (Ubuntu 22.04) + Kubernetes

What's not working?

Container workflows cannot write their output as expected if the container sets a custom user.

Job Log Output

Controller and runner pod logs can be found here: https://gist.github.com/kwohlfahrt/1d45d62aa963e4a4eec2ca6b04c2cc19

Runner values.yaml:

containerMode:
  kubernetesModeWorkVolumeClaim:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 14Gi
    storageClassName: exclusive
  type: kubernetes
controllerServiceAccount:
  name: actions-runner-system-d3d990a5
  namespace: actions-runner-system-be6fdde6
githubConfigSecret: actions-runner-81cb830f
githubConfigUrl: https://github.com/CHARM-Tx
maxRunners: 3
minRunners: 1
template:
  spec:
    containers:
    - command:
      - /home/runner/run.sh
      env:
      - name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
        value: "false"
      - name: ACTIONS_RUNNER_CONTAINER_HOOK_TEMPLATE
        value: /home/runner/templates/worker.yaml
      image: <snip>.dkr.ecr.eu-central-1.amazonaws.com/github-runner:2.315.0
      name: runner
      resources:
        limits:
          cpu: "1"
      volumeMounts:
      - mountPath: /home/runner/templates
        name: templates
    securityContext:
      fsGroup: 1001
    volumes:
    - configMap:
        name: templates-3892142c
      name: templates

templates ConfigMap:

apiVersion: v1
data:
  worker.yaml: '{"spec":{"securityContext":{"fsGroup":1001}}}'
kind: ConfigMap
metadata:
  name: templates-3892142c
  namespace: actions-runner-66769bad
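For readability, the single-line JSON value of worker.yaml above is equivalent to this block-style form (identical content, since JSON is a subset of YAML):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: templates-3892142c
  namespace: actions-runner-66769bad
data:
  worker.yaml: |
    spec:
      securityContext:
        fsGroup: 1001
```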
@kwohlfahrt added the bug label on May 16, 2024
@kwohlfahrt (Author)

I had previously filed this in the wrong repo, in actions/actions-runner-controller#3517.

@gdubicki

In case someone didn't catch this, the workaround for this problem is to force the non-root user that you use in your Docker image to have UID 1001.

@kwohlfahrt (Author)

kwohlfahrt commented Jun 14, 2024

Unfortunately, I don't think we can apply this workaround. The base image comes from a vendor and sets up the home directory with configuration files the software needs to run, so we have no control over the UID at build time.

Overriding the UID at runtime also fails, because the permissions associated with the files don't apply to the new UID, as described in the issue. The only thing we can inject is fsGroup, but that doesn't allow the worker to write the GitHub Actions command files, hence this issue.

We could probably do some recursive chown as part of our build steps, but that's starting to get into quite hacky territory.
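For reference, the "recursive chown" idea would look roughly like the following as a final image layer. This is only a sketch: the base image name, user name vendoruser, and path /home/vendoruser are illustrative placeholders, not details from the actual vendor image.

```dockerfile
FROM vendor/base-image:latest
USER root
# Hand the vendor user's home directory to UID 1001 at build time, so the
# pod can later run with runAsUser: 1001 and still read the config files.
RUN chown -R 1001:1001 /home/vendoruser
USER 1001
```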
