Updated internal dev guidelines #2296
base: main
Pull request overview
This PR updates the internal development guidelines to reflect changes in the ray_run.py command-line interface parameters. The documentation now uses the current parameter syntax and adds a helpful command for viewing job logs.
Key Changes
- Updated the `ray_run.py` command to use the `--cluster` parameter and the shortened `-e` flag for environment variables
- Added the `ray job logs` command to help developers view logs for specific jobs
- Updated the path to the example script from `experiments/hello_world.py` to `experiments/tutorials/hello_world.py`
rjpower
left a comment
thanks for the cleanup!
uv run scripts/ray/cluster.py --config infra/marin-us-central1.yaml list-jobs
# Get Job Logs for a specific job
ray job logs --address "http://127.0.0.1:8265" <JOB_ID>
For this, let's make sure the user knows they should connect to the dashboard first:
# connect to dashboard
# `uv run ...cluster.py dashboard`
ray job logs...
Updated to this!
# Get Job Logs for a specific job
# Ensure that the dashboard for the correct cluster is running (run this in another terminal)
# > uv run scripts/ray/cluster.py --config infra/marin-us-central1.yaml dashboard
ray job logs --address "http://127.0.0.1:8265" <JOB_ID>
Thanks for the suggestion!
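The two-step flow discussed here (start the dashboard, then fetch logs) could also be wrapped in a small guard that checks the local dashboard address before calling `ray job logs`. This is only a sketch: the `dashboard_reachable` and `job_logs` helpers and the `DASHBOARD_URL` variable are hypothetical names, and probing the dashboard URL with `curl` is an assumption about what counts as "running", not something the guidelines prescribe.

```shell
#!/usr/bin/env sh
# Sketch only: these helpers are hypothetical and not part of the Marin scripts.
DASHBOARD_URL="${DASHBOARD_URL:-http://127.0.0.1:8265}"

# Returns 0 if something answers HTTP successfully at the given URL within 2 seconds.
dashboard_reachable() {
  curl --silent --fail --max-time 2 --output /dev/null "$1"
}

# Fetch logs for one job, but only if the dashboard address responds.
job_logs() {
  job_id="$1"
  if [ -z "$job_id" ]; then
    echo "usage: job_logs <JOB_ID>" >&2
    return 2
  fi
  if ! dashboard_reachable "$DASHBOARD_URL"; then
    echo "dashboard not reachable at $DASHBOARD_URL; start it in another terminal first" >&2
    return 1
  fi
  ray job logs --address "$DASHBOARD_URL" "$job_id"
}
```

With the dashboard running in another terminal, `job_logs <JOB_ID>` would then behave like the plain `ray job logs` call, and otherwise it prints a reminder instead of a connection error.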
Description
While I was onboarding to Marin I noticed that some of the parameters to ray_run have changed. I updated the commands in the tutorial to reflect the changes I observed.
Before:
uv run lib/marin/src/marin/run/ray_run.py --no_wait --env_vars WANDB_API_KEY=${WANDB_API_KEY} -- python experiments/hello_world.py
After:
uv run lib/marin/src/marin/run/ray_run.py --cluster infra/marin-us-central1.yaml --no_wait -e WANDB_API_KEY ${WANDB_API_KEY} -- python experiments/tutorials/hello_world.py
For testing, I also personally found it valuable to view the logs for a run, so I added this line to the tutorial:
ray job logs --address "http://127.0.0.1:8265" <JOB_ID>
I would appreciate it if a maintainer more familiar with the cluster/Ray would double-check that this is the recommended way of doing this!
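One side effect of the new `-e NAME VALUE` form shown in the description: if `WANDB_API_KEY` is unset, the unquoted `${WANDB_API_KEY}` expands to nothing and the value argument silently disappears. A minimal sketch of a pre-flight check follows, assuming a hypothetical `require_env` helper that is not part of `ray_run.py`:

```shell
# Hypothetical helper: fail early if the named variable is unset or empty.
require_env() {
  # Only accept plain identifiers, since the name is passed to eval below.
  case "$1" in
    ''|*[!A-Za-z0-9_]*)
      echo "error: invalid variable name" >&2
      return 2
      ;;
  esac
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
}

# Usage sketch (quoting the value keeps it a single argument):
# require_env WANDB_API_KEY && uv run lib/marin/src/marin/run/ray_run.py \
#   --cluster infra/marin-us-central1.yaml --no_wait \
#   -e WANDB_API_KEY "${WANDB_API_KEY}" -- python experiments/tutorials/hello_world.py
```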