Mount GCS via single host gcsfuse and expose in container #2294
- Run gcsfuse once on the host at `/tmp/gcsfuse_mount`
- Expose the mount inside the container at `/opt/gcsfuse_mount` via symlink
- Avoid Docker bind-mount source-path creation failures
- Enable `allow_other` + permissive modes so the container can read the FUSE mount
- Thread `--verbose` through to the Ray CLI (`ray up/down/attach/exec -v`)
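For readers unfamiliar with the host-level mount, here is a rough sketch of what such a gcsfuse invocation could look like. It is not taken from this PR's YAML: the function name, the bucket name `example-bucket`, and the `777` mode values are illustrative assumptions. Only the mount point, `allow_other`, and the idea of permissive modes come from the description above.

```python
import shlex


def gcsfuse_mount_command(bucket: str, mount_point: str = "/tmp/gcsfuse_mount") -> list[str]:
    """Build an argv for a single host-level gcsfuse mount (hypothetical helper)."""
    return [
        "gcsfuse",
        "-o", "allow_other",   # let users other than the mounting user (e.g. the container) read the mount
        "--file-mode", "777",  # assumed "permissive" value; the PR's actual modes are not shown here
        "--dir-mode", "777",   # assumed "permissive" value
        bucket,
        mount_point,
    ]


# "example-bucket" is a placeholder, not a bucket named in the PR.
print(shlex.join(gcsfuse_mount_command("example-bucket")))
```

Note that when gcsfuse is run as a non-root user, `allow_other` normally also requires `user_allow_other` to be enabled in `/etc/fuse.conf`.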
Pull request overview
This PR refactors the GCS mount strategy to use a single host-level gcsfuse mount and propagates verbose flags through Ray CLI commands.
Key changes:
- Moves gcsfuse mounting from per-container setup to host-level initialization at `/tmp/gcsfuse_mount`
- Adds FUSE configuration with `allow_other` and permissive modes to enable container access
- Implements verbose flag propagation from the cluster.py CLI to Ray commands (up/down/attach/exec)
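The verbose passthrough can be pictured with a small sketch. This is not the PR's `_maybe_add_ray_verbose` implementation, whose signature is not shown here. The name `maybe_add_ray_verbose` and the list-based command shape are assumptions made for illustration.

```python
def maybe_add_ray_verbose(cmd: list[str], verbose: bool) -> list[str]:
    """Return a copy of a `ray <subcommand> ...` argv with -v added when requested.

    Hypothetical helper: assumes cmd[0] is "ray" and cmd[1] is the subcommand.
    """
    if not verbose or "-v" in cmd or "--verbose" in cmd:
        return list(cmd)
    # Place the flag right after the subcommand so it is never mistaken
    # for part of a trailing remote command (as with `ray exec`).
    return [*cmd[:2], "-v", *cmd[2:]]


print(maybe_add_ray_verbose(["ray", "up", "cluster.yaml"], verbose=True))
print(maybe_add_ray_verbose(["ray", "down", "cluster.yaml"], verbose=False))
```

Returning a new list rather than mutating the input keeps call sites free to reuse the base command for several subcommands.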
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| scripts/ray/cluster.py | Added _maybe_add_ray_verbose helper function and updated all Ray CLI invocations to support verbose flag passthrough; updated _stop_cluster_internal signature to accept Context |
| infra/marin-us-west4.yaml | Added gcsfuse installation and host-level mount in initialization_commands; replaced direct mount with symlink approach in setup_commands |
| infra/marin-us-east5.yaml | Same gcsfuse changes as us-west4 for consistency across regions |
| infra/marin-us-east5-a.yaml | Same gcsfuse changes as us-west4 for consistency across regions |
| infra/marin-us-east1.yaml | Same gcsfuse changes as us-west4 for consistency across regions |
| infra/marin-us-central2.yaml | Same gcsfuse changes as us-west4 for consistency across regions |
| infra/marin-us-central2-staging.yaml | Same gcsfuse changes as us-west4 for staging environment |
| infra/marin-us-central1.yaml | Same gcsfuse changes as us-west4 for consistency across regions |
| infra/marin-eu-west4.yaml | Same gcsfuse changes as us-west4 for EU region |
| infra/marin-eu-west4-a.yaml | Same gcsfuse changes as us-west4 for EU region |
| infra/marin-cluster-template.yaml | Template file updated with gcsfuse changes to propagate to future cluster configurations |
| infra/marin-big-run.yaml | Same gcsfuse changes as us-west4 for big-run cluster |
- if [ -e /opt/gcsfuse_mount ] && [ ! -L /opt/gcsfuse_mount ]; then sudo rm -rf /opt/gcsfuse_mount; fi
- sudo ln -sfn /tmp/gcsfuse_mount /opt/gcsfuse_mount
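The two commands above first remove anything at `/opt/gcsfuse_mount` that is not a symlink, then force-create the symlink. As an illustration only, the same idempotent pattern can be sketched in Python against a temporary directory. `ensure_symlink` is a hypothetical name, and the paths under the temp root merely stand in for `/tmp` and `/opt`.

```python
import shutil
import tempfile
from pathlib import Path


def ensure_symlink(link: Path, target: Path) -> None:
    """Point `link` at `target`, replacing whatever currently sits at `link`."""
    if link.is_symlink() or link.is_file():
        link.unlink()        # existing (possibly stale or broken) symlink, or a plain file
    elif link.is_dir():
        shutil.rmtree(link)  # real directory left over from an earlier setup
    link.symlink_to(target)


with tempfile.TemporaryDirectory() as root:
    target = Path(root) / "tmp" / "gcsfuse_mount"
    target.mkdir(parents=True)
    link = Path(root) / "opt" / "gcsfuse_mount"
    link.mkdir(parents=True)      # simulate a stale real directory at the link path
    ensure_symlink(link, target)
    ensure_symlink(link, target)  # running it again leaves the same result
    print(link.is_symlink(), link.resolve() == target.resolve())
```

Because the link target is an absolute path, it resolves in whatever filesystem view the reader has, so it only works inside a container if `/tmp/gcsfuse_mount` is visible there as well.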
Copilot AI, Jan 7, 2026
The symlink created at /opt/gcsfuse_mount will not be accessible inside the Docker container because /opt is not mounted into the container. Only /tmp is mounted (line 48). For the mount to be accessible at /opt/gcsfuse_mount inside the container, you need to add a volume mount like "-v /opt:/opt" to both head_run_options and worker_run_options.
lies
lrwxrwxrwx 1 root root 18 Jan 6 22:37 /opt/gcsfuse_mount -> /tmp/gcsfuse_mount
bash: gcsfuse_mount: command not found
a
dedupe
gcsfuse_mount
helmet-data
huggingface-cache
marin-us-central2
medu-models
models
nfliu
nvidia--Llama-Nemotron-Post-Training-Dataset-v1-ed905e6
rjpower
left a comment
seems fine, not sure why we use /tmp/ outside but /opt inside but meh
Part of putting vllm tpu in docker