Use tar --no-same-owner option for untar module #9338
Conversation
Might need to run the CI checks again; it looks like there were some network and/or disk-full related errors on the self-hosted runners, e.g.:
- https://github.com/nf-core/modules/actions/runs/19082094294/job/55387492031#step:6:1057
- https://github.com/nf-core/modules/actions/runs/19082094294/job/55387491920#step:6:831
- https://github.com/nf-core/modules/actions/runs/19082094294/job/55387491895#step:6:803
I suspect that this change is triggering far more tests than normally run, resulting in disk space exhaustion on the runners. To test this hypothesis, in a fork of nf-core/modules I switched to the GitHub-hosted runners, using the secondary /mnt partition for conda environments, docker containers, and nextflow/nf-test work directories (using the technique described in #7016 (comment); a rough sketch of the idea is shown after this comment). This resulted in substantially more checks passing: https://github.com/fasrc/modules/actions/runs/19448287416 There were still some failures, though, some of which seem to have plausible explanations, perhaps indicating they haven't been run in a while. For example, CELLRANGERARC_MKFASTQ failed in some of the docker and singularity shards because its output differed from the snapshot (https://github.com/fasrc/modules/actions/runs/19448287416/job/55715338873#step:5:449); however, its results are non-deterministic due to multithreading:
Also, METAPHLAN3_MERGEMETAPHLANTABLES appears to be subject to bit rot: distutils was removed in Python 3.12 (https://peps.python.org/pep-0632/), and isn't going to be present in the Python 3.13 installed in the conda environment. I'm not sure how to proceed with this one; any guidance would be appreciated!
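Here is a rough sketch of the /mnt relocation technique mentioned above; the paths, the daemon.json edit, and the choice of environment variables are illustrative assumptions, not the exact configuration used in the linked workflow run:

```bash
# Relocate the space-hungry caches onto the runner's larger /mnt partition
# (paths and variable names are illustrative).
sudo mkdir -p /mnt/docker /mnt/conda /mnt/singularity /mnt/nf-work

# Move Docker's image/layer storage to /mnt (requires a daemon restart).
echo '{"data-root": "/mnt/docker"}' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker

# Point Nextflow's conda/singularity caches and the work directory at /mnt.
export NXF_CONDA_CACHEDIR=/mnt/conda
export NXF_SINGULARITY_CACHEDIR=/mnt/singularity
export NXF_WORK=/mnt/nf-work
```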
Addresses CI runners running out of space with most shards
Per nf-core Slack, I temporarily increased max_shards from 15 to 30. Will revert after tests have run and before the PR is merged.
Still running out of space with 30 shards. Bumping max_shards to 60 to see if that's sufficient.
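For context, max_shards controls how many slices the test suite is split into, with each CI job running only its own slice via nf-test's sharding. A minimal illustration, assuming nf-test's `--shard` option is what the workflow feeds (shard index and total below are made up):

```bash
# With 60 shards, each runner executes only its 1/60 slice of the selected
# tests; e.g. the 7th shard (numbers here are illustrative):
nf-test test --profile docker --shard 7/60
```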
Try merging the master branch into your branch; that should lower the number of tests again, I think!
@famosab Thanks for the tip! That substantially reduced the number of test failures. There are still some failures, though none of them seem to be related to the change to the tar command proposed in this PR:
- These are the remaining no-space-left-on-device errors. I could try temporarily bumping up max_shards further?
- This job seemed to have a network issue while generating a SIF ("FATAL: While making image from oci registry: error fetching image to cache: while building SIF from layers: conveyor failed to get: error writing layer: unexpected EOF"); perhaps a rerun could fix it?
- These tests fail because the metaphlan 3.0.12 bioconda package references a Python library that was removed in Python 3.12 (and is not present in the Python 3.13 currently installed by the environment). This was apparently fixed in MetaPhlAn 4.2.0: biobakery/MetaPhlAn#232 (comment)
- Another MetaPhlAn 3 issue, possibly related to the Python version?
- A "different snapshot" error in kofamscan output that I'm able to reproduce on the master branch in a codespace with, e.g.
- Another different-snapshot error, also reproducible in a codespace on the master branch, using
- A different-snapshot error with harmonization/rgi, reproducible in a codespace on the master branch:
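As an aside, the distutils failure is easy to confirm outside the module: the standard-library package was dropped in Python 3.12 (PEP 632), so any import of it fails on newer interpreters. A trivial check (the interpreter invocation is just an example):

```bash
# distutils was removed from the standard library in Python 3.12 (PEP 632);
# under the Python 3.13 in the conda environment this import now fails with:
#   ModuleNotFoundError: No module named 'distutils'
python3 -c "import distutils"
```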
PR to attempt to fix the metaphlan3_metaphlan3 and metaphlan3_mergemetaphlantables errors:
The dev container uses the root user in the container:
modules/.devcontainer/devcontainer.json
Line 12 in 113690e
When run as root, `tar -x` will preserve the ownership (uid/gid) of files in the tarball upon extraction. This can result in an error when using rootless podman if the dev container's `localWorkspaceFolder`, i.e.:
modules/.devcontainer/devcontainer.json
Lines 5 to 8 in 113690e
resides on an NFS file system (see Rootless Podman and NFS for more details); e.g.:
The solution proposed by this PR is to add the GNU tar `--no-same-owner` option, so that the extracted files are owned by the user that runs the tar command (in the preceding scenario, root in the dev container's user namespace, which is mapped to the user running podman on the host).
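To make the effect of the flag concrete, here is a minimal illustration; the archive name and the surrounding flags are placeholders rather than the untar module's exact command line:

```bash
# Without --no-same-owner, root restores the uid/gid stored in the archive,
# which can fail (e.g. chown errors) under rootless podman on NFS:
tar -xvf archive.tar.gz

# With --no-same-owner, extracted files belong to the invoking user (root
# inside the dev container's user namespace, which rootless podman maps back
# to the unprivileged user on the host):
tar -xvf archive.tar.gz --no-same-owner
```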
PR checklist
- `versions.yml` file.
- `label`
- `nf-core modules test <MODULE> --profile docker`
- `nf-core modules test <MODULE> --profile singularity`
- `nf-core modules test <MODULE> --profile conda`
- `nf-core subworkflows test <SUBWORKFLOW> --profile docker`
- `nf-core subworkflows test <SUBWORKFLOW> --profile singularity`
- `nf-core subworkflows test <SUBWORKFLOW> --profile conda`