[sglang] feat: retires sglang spmd mode in the codebase #4422
Code Review
This pull request successfully retires the legacy SGLang SPMD mode, streamlining the codebase to exclusively use the async/server mode. The SGLangRollout class, along with its associated helper functions, tests, and configuration references, has been removed. This is a positive and significant refactoring effort that enhances maintainability and clarifies the supported SGLang backend. The changes are consistently applied across example scripts, test files, and core implementation, ensuring a clean transition to the new architecture.
The vLLM SPMD mode was retired earlier in #4411.
What does this PR do?

Retires the legacy SGLang SPMD rollout path and makes async/server mode the only supported backend for SGLang. The PR removes the old `SGLangRollout` class, its helpers, tests, and recipes, and updates all docs, scripts, and CI references so they speak only to the async HTTP adapter (`ServerAdapter`).

Checklist Before Starting

- [ ] Search for similar PRs. Paste at least one query link here: _N/A (follow-up to the vLLM SPMD removal)._
- [ ] Format the PR title as `[sglang, rollout, trainer, recipe, ci, doc] refactor: remove SGLang SPMD rollout`

Test

API and Usage Example

Same as vLLM: configs/scripts must set `actor_rollout_ref.rollout.mode=async` and rely on the HTTP server. Example:

```bash
python -m verl.trainer.main_ppo \
    ... \
    actor_rollout_ref.rollout.name=sglang \
    actor_rollout_ref.rollout.mode=async \
    ...
```

Design & Code Changes

- Deleted the `SGLangRollout` class and associated helpers from `verl/workers/rollout/sglang_rollout/sglang_rollout.py`, keeping only the async `ServerAdapter`. Cleared its registry entries, configs, and guards the same way as the vLLM PR.
- Removed SGLang SPMD-specific tests (`tests/workers/rollout/test_sglang_*`) and CI steps in `.github/workflows/sgl.yml`, plus any lint exclusions that referenced those files.
- Updated recipes/examples/e2e scripts that referenced SGLang rollout to hardcode `rollout.mode=async`, drop sync branches, and set `return_raw_chat` (mirroring the vLLM cleanup).

Checklist Before Submitting

- [ ] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [ ] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting).
- [x] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs).
- [x] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: _Removed the obsolete SGLang SPMD jobs; async workflows remain covered._
- [ ] Once your PR is ready for CI, notify the `ci-request` channel (or Feishu group).
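
The usage requirement stated in this PR (SGLang runs must set `actor_rollout_ref.rollout.mode=async`) could be checked in a launch script before training starts. The sketch below is a hypothetical helper, not part of verl: the function names `parse_overrides` and `check_sglang_mode` are made up for illustration, and it assumes the overrides arrive as plain `key=value` strings like those in the example command. It also takes the strict reading that a missing mode is an error, which may be more conservative than the actual config defaults.

```python
# Hypothetical pre-flight check; none of these names exist in verl.
MODE_KEY = "actor_rollout_ref.rollout.mode"
NAME_KEY = "actor_rollout_ref.rollout.name"


def parse_overrides(args):
    """Turn ['a.b=c', ...] into {'a.b': 'c'}; tokens without '=' are skipped."""
    overrides = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if sep:
            # Drop a leading '+' (Hydra-style "append" prefix) from the key.
            overrides[key.lstrip("+")] = value
    return overrides


def check_sglang_mode(args):
    """Return a list of problems found in the overrides (empty if none)."""
    overrides = parse_overrides(args)
    problems = []
    if overrides.get(NAME_KEY) == "sglang":
        mode = overrides.get(MODE_KEY)
        # Assumption: the mode must be set explicitly to 'async'.
        if mode != "async":
            problems.append(
                f"{MODE_KEY} must be 'async' for sglang, got {mode!r}"
            )
    return problems


if __name__ == "__main__":
    import sys

    for problem in check_sglang_mode(sys.argv[1:]):
        print(problem, file=sys.stderr)
```

Run with the same overrides passed to the trainer, the script prints one line per problem to stderr and prints nothing when the SGLang overrides already select async mode.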