Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use pre-commit hooks in tox configuration #2057

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open

Use pre-commit hooks in tox configuration #2057

wants to merge 10 commits into from

Conversation

waxlamp
Copy link
Member

@waxlamp waxlamp commented Oct 23, 2024

This extends #2045 to use the same pre-commit hooks to perform linting and formatting via tox.ini. In particular, this "refactors" the specification of what versions of codespell and ruff to use to just one place: the pre-commit config file.

The variants of the pre-commit hooks that are needed to replace the tox.ini commands are listed in the pre-commit config using the manual stage specifier with appropriate aliases, meaning they will not be invoked as a git hook ever, and enabling them to be invoked by name in the tox config.

This PR also modifies the dev Dockerfile to build it to run as the user from the host system. That is, instead of running as root inside the container, it will instead run as the same (non-root) user who is doing development work (including building this container, etc.). That enables the use of pre-commit both inside and outside the container, since pre-commit requires examining Git status, etc. Without this modification, you get errors about "dubious ownership" of the Git repository.

I am totally willing to peel off the Dockerfile changes to a separate PR if we require further discussion, especially if some of these ideas are suitable for general use with Docker containers. Let me know what you think, @mvandenburgh.

@yarikoptic
Copy link
Member

There are some quirks w.r.t. running this via Docker:

  1. Because the permissions for the .git directory are not those of root,

IMHO nobody should run dev tools as root -- should be mapped to the real UID via --user option etc. (in https://github.com/ReproNim/containers/blob/master/scripts/singularity_cmd#L198C1-L199C19 we passed UID and GID env vars) -- what container we are talking about to try on?

@mvandenburgh
Copy link
Member

Should the difficulties around running linting inside docker be documented in DEVELOPMENT.md?

@yarikoptic
Copy link
Member

when do git and tox need to run inside docker environment?

@waxlamp
Copy link
Member Author

waxlamp commented Oct 25, 2024

IMHO nobody should run dev tools as root -- should be mapped to the real UID via --user option etc. (in https://github.com/ReproNim/containers/blob/master/scripts/singularity_cmd#L198C1-L199C19 we passed UID and GID env vars) -- what container we are talking about to try on?

Nobody is running dev tools as root 🙂. Docker containers, by default, run as root inside the container (even when launched by a non-root user on the host system). Your example is using Singularity, which was designed in part to work with non-root users by default.

when do git and tox need to run inside docker environment?

Tox needs to run inside the container precisely when the dev wants it to. For example, I often run the command docker compose run --rm django tox -e lint etc. to perform linting in my dev environment. Git does not need to run inside the container, though pre-commit uses git to determine how to run its hooks, which is where I was running into trouble.

Resonant applications provides this container as a convenience. Adopting pre-commit in dandi-archive caused some of the "rev matching" problems to surface, so I'm looking into how to fix it to provide a generally better experience across Resonant apps.

In any case, I am working on a way to build the container so that it runs as the same user as the dev who built it. If that works, then it will provide parity between host OS and Docker container. Stay tuned for updates.

@yarikoptic
Copy link
Member

There are some quirks w.r.t. running this via Docker:

  1. Because the permissions for the .git directory are not those of root,

IMHO nobody should run dev tools as root -- should be mapped to the real UID via --user option etc. (in https://github.com/ReproNim/containers/blob/master/scripts/singularity_cmd#L198C1-L199C19 we passed UID and GID env vars) -- what container we are talking about to try on?

FWIW -- checked that it works:

❯ docker compose run --rm --user "$(id -u)" django touch 123
WARN[0000] /home/yoh/proj/dandi/dandi-archive/docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion 
WARN[0000] /home/yoh/proj/dandi/dandi-archive/docker-compose.override.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion 
[+] Creating 3/0
 ✔ Container dandi-archive-minio-1     Running                                                                                                                                 0.0s 
 ✔ Container dandi-archive-rabbitmq-1  Running                                                                                                                                 0.0s 
 ✔ Container dandi-archive-postgres-1  Running                                                                                                                                 0.0s 

❯ ls -ld 123
-rw-r--r-- 1 yoh root 0 Oct 28 12:18 123

so just add --user "$(id -u)" to your invocation of tox within container. Depending on scripts, you might also benefit from passing env vars as I pointed our so more complete would be --user "$(id -u)" -e "UID=$(id -u)" -e "GID=$(id -g)" (potentially improved to your liking)

@yarikoptic
Copy link
Member

note: might need to clean up "root owned" artifacts from prior runs of tox first, and then it seems to work just fine

❯ sudo rm -rf .tox/lint/log .ruff_cache
❯ docker compose run --rm --user "$(id -u)" -e "UID=$(id -u)" -e "GID=$(id -g)" django tox -e lint
WARN[0000] /home/yoh/proj/dandi/dandi-archive/docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion 
WARN[0000] /home/yoh/proj/dandi/dandi-archive/docker-compose.override.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion 
[+] Creating 3/0
 ✔ Container dandi-archive-postgres-1  Running                                                                                                                                 0.0s 
 ✔ Container dandi-archive-minio-1     Running                                                                                                                                 0.0s 
 ✔ Container dandi-archive-rabbitmq-1  Running                                                                                                                                 0.0s 
lint: commands[0]> ruff --version
ruff 0.6.9
lint: commands[1]> ruff check
All checks passed!
lint: commands[2]> ruff format --check
155 files already formatted
lint: commands[3]> codespell .
  lint: OK (0.34=setup[0.06]+cmd[0.01,0.02,0.02,0.23] seconds)
  congratulations :) (0.44 seconds)

@waxlamp
Copy link
Member Author

waxlamp commented Oct 31, 2024

so just add --user "$(id -u)" to your invocation of tox within container

This mostly works--depending on how the container image was built you can still run into problems. Plus, having to add the --user flag on every invocation is annoying and error-prone (you might forget to do it, etc.)

note: might need to clean up "root owned" artifacts from prior runs of tox first, and then it seems to work just fine

This is really the core of the issue--having to remember the --user flag, plus the fact that your root user might be "polluting" your dev directory makes for an unergonomic dev experience.

But I'm experimenting with moving these things to the build phase so that the regular commands "just work". Stand by, I have an incoming fix for all this (I hope).

@yarikoptic
Copy link
Member

yarikoptic commented Oct 31, 2024

note: might need to clean up "root owned" artifacts from prior runs of tox first, and then it seems to work just fine

This is really the core of the issue--having to remember the --user flag, plus the fact that your root user might be "polluting" your dev directory makes for an unergonomic dev experience.

The core issue is :

IMHO nobody should run dev tools as root -- should be mapped to the real UID via --user option etc. (in https://github.com/ReproNim/containers/blob/master/scripts/singularity_cmd#L198C1-L199C19 we passed UID and GID env vars) -- what container we are talking about to try on?

Nobody is running dev tools as root 🙂. Docker containers, by default, run as root inside the container (even when launched by a non-root user on the host system).

and the fact that "Docker containers" are not animate, they do not run themselves. They are executed by people who run them and as a result running dev tools as root. It is just a fact.

Tox needs to run inside the container precisely when the dev wants it to.

It is something to remember about and to not execute docker with any dev tools/scripts without precautions being taken -- ideally to avoid that generally. Same advice for sudo - too many people often did sudo pip install causing all kinds of pains later on -- all of that is IMHO just a bad practice.

In the longer run it might be better to look to switch to podman and having fake root inside the container space so touched bind-mounted files would still be fine.

I have expressed my opinion ("never run dev tools as root" - whether through docker or sudo or ...), and will quiet down on this particular issue.

@waxlamp
Copy link
Member Author

waxlamp commented Oct 31, 2024

note: might need to clean up "root owned" artifacts from prior runs of tox first, and then it seems to work just fine

This is really the core of the issue--having to remember the --user flag, plus the fact that your root user might be "polluting" your dev directory makes for an unergonomic dev experience.

The core issue is :

IMHO nobody should run dev tools as root -- should be mapped to the real UID via --user option etc. (in https://github.com/ReproNim/containers/blob/master/scripts/singularity_cmd#L198C1-L199C19 we passed UID and GID env vars) -- what container we are talking about to try on?

Nobody is running dev tools as root 🙂. Docker containers, by default, run as root inside the container (even when launched by a non-root user on the host system).

and the fact that "Docker containers" are not animate, they do not run themselves. They are executed by people who run them and as a result running dev tools as root. It is just a fact.

Tox needs to run inside the container precisely when the dev wants it to.

It is something to remember about and to not execute docker with any dev tools/scripts without precautions being taken -- ideally to avoid that generally. Same advice for sudo - too many people often did sudo pip install causing all kinds of pains later on -- all of that is IMHO just a bad practice.

In the longer run it might be better to look to switch to podman and having fake root inside the container space so touched bind-mounted files would still be fine.

I have expressed my opinion ("never run dev tools as root" - whether through docker or sudo or ...), and will quiet down on this particular issue.

I did not mean to be dismissive--I was just pointing out a reality of how most people run Docker containers. As I said, I'm working on a solution that addresses this issue head on (it is similar to what you suggested, but even better).

@waxlamp waxlamp force-pushed the detox branch 3 times, most recently from 58fb6cb to 5e90b4c Compare November 5, 2024 15:27
DEVELOPMENT.md Outdated Show resolved Hide resolved
DEVELOPMENT.md Outdated Show resolved Hide resolved
Copy link
Member

@mvandenburgh mvandenburgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get this when I run the docker compose build command. I think the adduser command doesn't support the . character in usernames (my username on my ubuntu system is michael.vandenburgh).

[+] Building 1.2s (11/33)                                                                                                                                          docker:default
 => [celery internal] load build definition from django.Dockerfile                                                                                                           0.0s
 => => transferring dockerfile: 2.63kB                                                                                                                                       0.0s
 => [django internal] load build definition from django.Dockerfile                                                                                                           0.0s
 => => transferring dockerfile: 2.63kB                                                                                                                                       0.0s
 => [celery internal] load metadata for docker.io/library/python:3.11-slim                                                                                                   0.6s
 => [celery internal] load .dockerignore                                                                                                                                     0.0s
 => => transferring context: 2B                                                                                                                                              0.0s
 => [django internal] load .dockerignore                                                                                                                                     0.0s
 => => transferring context: 2B                                                                                                                                              0.0s
 => [celery internal] load build context                                                                                                                                     0.4s
 => => transferring context: 657.14kB                                                                                                                                        0.4s
 => [django  1/16] FROM docker.io/library/python:3.11-slim@sha256:5148c0e4bbb64271bca1d3322360ebf4bfb7564507ae32dd639322e4952a6b16                                           0.0s
 => CACHED [django  2/16] RUN apt-get update &&     apt-get install --no-install-recommends --yes     libpq-dev gcc libc6-dev git &&     rm -rf /var/lib/apt/lists/*         0.0s
 => CACHED [django  3/16] RUN pip install pre-commit                                                                                                                         0.0s
 => ERROR [django  4/16] RUN if [ "692589655" != 0 ]; then adduser --uid 692589655 --gid 692584961 --home /home/michael.vandenburgh michael.vandenburgh; fi                  0.5s
 => [django internal] load build context                                                                                                                                     0.4s
 => => transferring context: 657.14kB                                                                                                                                        0.4s
------
 > [django  4/16] RUN if [ "692589655" != 0 ]; then adduser --uid 692589655 --gid 692584961 --home /home/michael.vandenburgh michael.vandenburgh; fi:
0.489 adduser: Please enter a username matching the regular expression
0.489             configured via the NAME_REGEX configuration variable.  Use the
0.489             `--allow-bad-names' option to relax this check or reconfigure
0.489             NAME_REGEX in configuration.
------
failed to solve: process "/bin/sh -c if [ \"${USERID}\" != 0 ]; then adduser --uid ${USERID} --gid ${GROUPID} --home /home/${LOGIN} $LOGIN; fi" did not complete successfully: exit code: 1

The man page for adduser says you can use --allow-bad-names to override this, but then I get this error:

 > [celery  4/16] RUN if [ "692589655" != 0 ]; then adduser --uid 692589655 --gid 692584961 --home /home/michael.vandenburgh michael.vandenburgh --allow-bad-names; fi:
0.361 adduser: The GID 692584961 does not exist.

dev/django.Dockerfile Outdated Show resolved Hide resolved
dev/django.Dockerfile Outdated Show resolved Hide resolved
@waxlamp
Copy link
Member Author

waxlamp commented Nov 11, 2024

I get this when I run the docker compose build command. I think the adduser command doesn't support the . character in usernames (my username on my ubuntu system is michael.vandenburgh).

[+] Building 1.2s (11/33)                                                                                                                                          docker:default
 => [celery internal] load build definition from django.Dockerfile                                                                                                           0.0s
 => => transferring dockerfile: 2.63kB                                                                                                                                       0.0s
 => [django internal] load build definition from django.Dockerfile                                                                                                           0.0s
 => => transferring dockerfile: 2.63kB                                                                                                                                       0.0s
 => [celery internal] load metadata for docker.io/library/python:3.11-slim                                                                                                   0.6s
 => [celery internal] load .dockerignore                                                                                                                                     0.0s
 => => transferring context: 2B                                                                                                                                              0.0s
 => [django internal] load .dockerignore                                                                                                                                     0.0s
 => => transferring context: 2B                                                                                                                                              0.0s
 => [celery internal] load build context                                                                                                                                     0.4s
 => => transferring context: 657.14kB                                                                                                                                        0.4s
 => [django  1/16] FROM docker.io/library/python:3.11-slim@sha256:5148c0e4bbb64271bca1d3322360ebf4bfb7564507ae32dd639322e4952a6b16                                           0.0s
 => CACHED [django  2/16] RUN apt-get update &&     apt-get install --no-install-recommends --yes     libpq-dev gcc libc6-dev git &&     rm -rf /var/lib/apt/lists/*         0.0s
 => CACHED [django  3/16] RUN pip install pre-commit                                                                                                                         0.0s
 => ERROR [django  4/16] RUN if [ "692589655" != 0 ]; then adduser --uid 692589655 --gid 692584961 --home /home/michael.vandenburgh michael.vandenburgh; fi                  0.5s
 => [django internal] load build context                                                                                                                                     0.4s
 => => transferring context: 657.14kB                                                                                                                                        0.4s
------
 > [django  4/16] RUN if [ "692589655" != 0 ]; then adduser --uid 692589655 --gid 692584961 --home /home/michael.vandenburgh michael.vandenburgh; fi:
0.489 adduser: Please enter a username matching the regular expression
0.489             configured via the NAME_REGEX configuration variable.  Use the
0.489             `--allow-bad-names' option to relax this check or reconfigure
0.489             NAME_REGEX in configuration.
------
failed to solve: process "/bin/sh -c if [ \"${USERID}\" != 0 ]; then adduser --uid ${USERID} --gid ${GROUPID} --home /home/${LOGIN} $LOGIN; fi" did not complete successfully: exit code: 1

The man page for adduser says you can use --allow-bad-names to override this, but then I get this error:

 > [celery  4/16] RUN if [ "692589655" != 0 ]; then adduser --uid 692589655 --gid 692584961 --home /home/michael.vandenburgh michael.vandenburgh --allow-bad-names; fi:
0.361 adduser: The GID 692584961 does not exist.

Thanks for the edge cases (regarding username nonconformance and group existence). Commits d7026cc and 2ada467 should help.

DEVELOPMENT.md Outdated Show resolved Hide resolved
Comment on lines +21 to +22
RUN getent group ${GROUPID} || addgroup --gid ${GROUPID} ${GROUPNAME}
RUN getent passwd ${USERID} || adduser --allow-bad-names --uid ${USERID} --gid ${GROUPID} --home /home/${LOGIN} $LOGIN
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, I get

 => ERROR [django  3/16] RUN getent group 692584961 || addgroup --gid 692584961 domain^users                                                0.4s
------                                                                                                                                           
 > [django  3/16] RUN getent group 692584961 || addgroup --gid 692584961 domain^users:                                                           
0.386 addgroup: To avoid problems, the username should consist only of                                                                           
0.386             letters, digits, underscores, periods, at signs and dashes, and                                                                
0.386             not start with a dash (as defined by IEEE Std 1003.1-2001). For                                                                
0.386             compatibility with Samba machine accounts, $ is also supported                                                                 
0.386             at the end of the username.  (Use the `--allow-all-names' option
0.386             to bypass this restriction.)

It seems like we also need to allow "bad" group names. Adding --allow-all-names as the error message suggests, seems to fix it.

Suggested change
RUN getent group ${GROUPID} || addgroup --gid ${GROUPID} ${GROUPNAME}
RUN getent passwd ${USERID} || adduser --allow-bad-names --uid ${USERID} --gid ${GROUPID} --home /home/${LOGIN} $LOGIN
RUN getent group ${GROUPID} || addgroup --allow-all-names --gid ${GROUPID} ${GROUPNAME}
RUN getent passwd ${USERID} || adduser --allow-bad-names --uid ${USERID} --gid ${GROUPID} --home /home/${LOGIN} $LOGIN

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants