Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dockerfile: use wrapper script instead of radosgw directly #758

Merged
merged 4 commits into from
Nov 16, 2023

Conversation

jecluis
Copy link
Contributor

@jecluis jecluis commented Oct 14, 2023

This patch set introduces a wrapper script, run-s3gw.sh, responsible for abstracting calling the radosgw binary.

Instead, we define a set of options that are s3gw-specific and translate them to radosgw options as needed.
This ensures consistency in the project, by removing direct references to radosgw.

Additionally, we're adding documentation about advanced usage of the s3gw container, for cases when using the container directly may be needed.

This pull request has a companion pull request in s3gw-tech/s3gw-charts#127.

Fixes: aquarist-labs/s3gw#754

Signed-off-by: Joao Eduardo Luis <[email protected]>

@jecluis jecluis added kind/enhancement Change that positively impacts existing code area/containers area/docs Related with documentation in general labels Oct 14, 2023
@jecluis jecluis added this to the v0.22.0 milestone Oct 14, 2023
@jecluis jecluis requested review from tserong and m-ildefons October 14, 2023 11:09
Copy link
Contributor

@tserong tserong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one tiny typo :-) and the code looks fine, but I have two questions and one small problem. Here's the questions:

  1. It looks like --rgw-frontends can't be overridden, which means there's no way to enable the status frontend. Should this PR also add status to frontend_args? Or will that be done separately?
  2. Why support args and env vars (as opposed to just one or the other)? Do we need to document that args will always override env vars?

The small problem is that since this change there's maybe something janky about signal handling. I ran docker build --file Dockerfile --tag s3gw --target s3gw . to build the thing, then docker run -p7480:7480 s3gw to run it, and it runs just fine. But when I hit CTRL-C in the terminal, nothing happens. It just keeps running. I have to docker kill it from elsewhere (compare with docker run quay.io/s3gw/s3gw:latest, for which CTRL-C kills it just fine).

docs/advanced-usage.md Outdated Show resolved Hide resolved
@jecluis
Copy link
Contributor Author

jecluis commented Oct 16, 2023

I found one tiny typo :-) and the code looks fine, but I have two questions and one small problem. Here's the questions:

1. It looks like `--rgw-frontends` can't be overridden, which means there's no way to enable the status frontend. Should this PR also add `status` to frontend_args? Or will that be done separately?

that could technically be done by providing the argument in the end, after --, but good shout, that should be an option to add.

2. Why support args _and_ env vars (as opposed to just one or the other)? Do we need to document that args will always override env vars?

Honestly, covering all bases. Some might be more inclined to rely on --env FOO=bar or something. Also, I suspect that it might be easier for k8s to use env vars.

The small problem is that since this change there's maybe something janky about signal handling. I ran docker build --file Dockerfile --tag s3gw --target s3gw . to build the thing, then docker run -p7480:7480 s3gw to run it, and it runs just fine. But when I hit CTRL-C in the terminal, nothing happens. It just keeps running. I have to docker kill it from elsewhere (compare with docker run quay.io/s3gw/s3gw:latest, for which CTRL-C kills it just fine).

I thought that would only work if using -it ? If not, maybe a sighandler on the script might suffice. I'll test that out and see how that goes.

Copy link
Contributor

@tserong tserong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good news! CTRL-C works to kill the container now when run from docker (I didn't try podman).

I've found a couple more typos, and one small kink (see review comments below).

Also, while --with-status does work to start the status backend (I see "starting handler: status in the logs) I can't connect to the damn thing when running docker run -d -p7480:7480 -p9090:9090 s3gw --with-status. I just get "connection reset" if I try to hit http://localhost:9090 in my browser. This seems to be because the status backend binds to 127.0.0.1 inside the container by default. If I do this:

 if [[ $with_status -eq 1 ]]; then
-  frontend_args+=", status"
+  frontend_args+=", status bind=0.0.0.0"
 fi

...it works fine. But I'm not sure we should actually make that change here in the runner script. Presumably that should be done in rgw_status_frontend.cc (why does that bind to 127.0.0.1 inside the container? AFAICT that makes in inaccessible to the outside)

"low", "none" (default: ${default_debug_level}).
--no-telemetry Disable telemetry.
--telemetry-url URL Specifies telemetry URL.
--with-status Enables status frontend at 127.0.0.1:9999.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
--with-status Enables status frontend at 127.0.0.1:9999.
--with-status Enables status frontend at 127.0.0.1:9090.


options:
--help Shows this message.
--id ID Specifies a custom instance ID (default: ${default_id}).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
--id ID Specifies a custom instance ID (default: ${default_id}).
--id ID Specifies a custom instance ID (default: s3gw).

--dns-name VALUE For vhost style buckets, VALUE should be the DNS domain
to be used.
--debug LEVEL Runs with debug. Levels available: "high", "medium",
"low", "none" (default: ${default_debug_level}).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"low", "none" (default: ${default_debug_level}).
"low", "none" (default: none).

"low", "none" (default: ${default_debug_level}).
--no-telemetry Disable telemetry.
--telemetry-url URL Specifies telemetry URL.
--with-status Enables status frontend at 127.0.0.1:9999.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
--with-status Enables status frontend at 127.0.0.1:9999.
--with-status Enables status frontend at 127.0.0.1:9090.

exit 1
}

trap "stop_rgw" EXIT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be moved down the bottom, just above when we run radosgw on line 199. Otherwise, if I try docker run s3gw --help, I get the help output, but I also see "kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]" printed to STDERR. This is because --help triggers exit 0 down on line 81, which hits the trap that calls stop_rgw(), which then tries to run kill with $rgw_pid as an empty string. Same with the various other exit cases. Moving trap "stop_rgw" EXIT down after those other exits fixes this.

@jecluis
Copy link
Contributor Author

jecluis commented Oct 23, 2023

...it works fine. But I'm not sure we should actually make that change here in the runner script. Presumably that should be done in rgw_status_frontend.cc (why does that bind to 127.0.0.1 inside the container? AFAICT that makes in inaccessible to the outside)

That's a good question. I always thought it could be exposed even if bound to 127.0.0.1. It's a shame it isn't. I think the rationale is that said sort of information should be consumed by something, not necessarily someone. If we expose it in 0.0.0.0, then we'll likely have to add authentication to it, whereas running on 127.0.0.1 would allow the admin to do shenanigans to get to it if they want to.

In principle, that was it. Given it's not reaching the port from outside the container though, I think the wrapper script needs to handle that. I just think there's no good way around it. Changing the status frontend to bind to 0.0.0.0 defeats the locality purpose of that frontend, and forces us to implement authentication for it.

@jecluis jecluis force-pushed the wip-wrap-container-entrypoint branch from 32865e2 to 7d640c7 Compare October 23, 2023 08:05
@jecluis jecluis requested a review from tserong October 23, 2023 08:05
@tserong
Copy link
Contributor

tserong commented Oct 23, 2023

I found https://stackoverflow.com/questions/52513336/is-there-a-way-to-expose-a-docker-container-port-bound-to-127-0-0-1-to-host#52518929 on stackoverflow which is interesting.

Anyway, I think what you've got here now with status bind=0.0.0.0 in the wrapper is fine, and does what we want, because the admin/user has explicitly decided to run it --with-status. And given what you've said, it probably makes sense to leave the default bind as 127.0.0.1 in the backend code, because that could theoretically be built by someone and run in other non containerized environments. And we can re-visit that decision if necessary later in a separate issue/PR/whatever.

tserong
tserong previously approved these changes Oct 23, 2023
@jecluis
Copy link
Contributor Author

jecluis commented Oct 24, 2023

merge dependent on me following up on s3gw-tech/s3gw-charts#127 -- these should be merged in tandem so we don't break the chart.

@jecluis jecluis modified the milestones: v0.22.0, v0.23.0 Oct 24, 2023
m-ildefons
m-ildefons previously approved these changes Oct 24, 2023
Copy link
Contributor

@m-ildefons m-ildefons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
It's common practice to name an entrypoint script entrypoint.sh

"low", "none" (default: ${default_debug_level}).
--no-telemetry Disable telemetry.
--telemetry-url URL Specifies telemetry URL.
--with-status Enables status frontend at port 9090.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docker file says EXPOSE 7481, not 9090

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7481, I think, is the SSL port. I didn't add the 9090 to the dockerfile, wonder if I should even if that's merely for documentation purposes.

@irq0
Copy link
Member

irq0 commented Oct 24, 2023

Heads up. Some more dependencies:

https://github.com/aquarist-labs/s3gw/blob/37bd64e3a277d22c190f6048e59b6d9303ea6dc4/tools/s3tests/runner.py#L58

https://github.com/irq0/vogon2/blob/5772347d8b8dff4c6da8a103402d1f39750e5116/vogon2.py#L130

s3tr needs changes: It doesn't use entrypoint, but command

vogon needs some changes ✔️

@jecluis jecluis self-assigned this Nov 13, 2023
@vmoutoussamy vmoutoussamy added the priority/1 Should be fixed for next release label Nov 15, 2023
@jecluis
Copy link
Contributor Author

jecluis commented Nov 15, 2023

Heads up. Some more dependencies:

https://github.com/aquarist-labs/s3gw/blob/37bd64e3a277d22c190f6048e59b6d9303ea6dc4/tools/s3tests/runner.py#L58

https://github.com/irq0/vogon2/blob/5772347d8b8dff4c6da8a103402d1f39750e5116/vogon2.py#L130

s3tr needs changes: It doesn't use entrypoint, but command

vogon needs some changes ✔️

@irq0 will any of these fail because of these changes? Looking at them it seems that they rely on executing the binary directly, not the entrypoint?

@jecluis jecluis dismissed stale reviews from m-ildefons and tserong via fd35267 November 15, 2023 18:30
@jecluis jecluis force-pushed the wip-wrap-container-entrypoint branch from 7d640c7 to fd35267 Compare November 15, 2023 18:30
@jecluis
Copy link
Contributor Author

jecluis commented Nov 15, 2023

@m-ildefons @tserong rebased and addressed the comments from before. Mind another look?

Signed-off-by: Joao Eduardo Luis <[email protected]>
This abstracts calling on radosgw, limiting the options the user can use
by default, while forcing (unless overridden by the user) all the
parameters we know we need (like the storage driver).

Signed-off-by: Joao Eduardo Luis <[email protected]>
Instead of moving binaries and libraries to /radosgw, lets keep things
in /s3gw instead, as yet another way to abstract what we are using
below.

Signed-off-by: Joao Eduardo Luis <[email protected]>
Fixes: aquarist-labs/s3gw#754

Signed-off-by: Joao Eduardo Luis <[email protected]>
@jecluis jecluis force-pushed the wip-wrap-container-entrypoint branch from fd35267 to df59833 Compare November 15, 2023 19:23
Copy link
Contributor

@tserong tserong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (I gave it a quick test too, both with and without --with-status)

@jecluis jecluis merged commit f6589aa into s3gw-tech:main Nov 16, 2023
2 checks passed
@jecluis jecluis deleted the wip-wrap-container-entrypoint branch November 16, 2023 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/containers area/docs Related with documentation in general kind/enhancement Change that positively impacts existing code priority/1 Should be fixed for next release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

docs: explain how to use --rgw-dns-name to enable hostname style buckets
5 participants