-
Notifications
You must be signed in to change notification settings - Fork 176
chore: rename penguin -> nemo_gym and add the gym submodule #1587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughThis pull request systematically replaces all references to the "Penguin" integration with "NeMo-Gym" across the codebase, including adding a new NeMo-Gym submodule, updating packaging configuration, environment wrappers, rollout logic, distributed registry entries, training algorithms, examples, and test suites. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Areas requiring extra attention:
Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
nemo_rl/environments/nemo_gym.py (2)
106-139: Fixrun_rolloutsreturn annotation and guard timing metrics for empty batches.
run_rolloutsis annotated as returninglist[dict], but it actually returns(nemo_rl_results, timing_metrics). This will confuse type checkers and anyone reading the API, and may mask call-site mistakes.At the same time, computing
timing_metrics[f"{timer_prefix}/postprocess_results_pct"]will raiseKeyErrorif no rollouts were processed (empty iterator) and will divide by zero if_run_rollouts_totalends up as 0.Suggested change:
- async def run_rollouts( + async def run_rollouts( self, nemo_gym_examples: list[dict], tokenizer: PreTrainedTokenizerBase, timer_prefix: str, - ) -> list[dict]: + ) -> tuple[list[dict], dict[str, float]]: @@ - nemo_rl_results = [] + nemo_rl_results: list[dict] = [] @@ - timing_metrics = timer.get_timing_metrics("sum") - total_time = timing_metrics.pop("_run_rollouts_total") - timing_metrics[f"{timer_prefix}/postprocess_results_pct"] = ( - 100 * timing_metrics[f"{timer_prefix}/postprocess_results"] / total_time - ) + timing_metrics = timer.get_timing_metrics("sum") + total_time = timing_metrics.pop("_run_rollouts_total") + postprocess_key = f"{timer_prefix}/postprocess_results" + if postprocess_key in timing_metrics and total_time > 0: + timing_metrics[f"{timer_prefix}/postprocess_results_pct"] = ( + 100 * timing_metrics[postprocess_key] / total_time + )This matches the actual return shape and avoids surprising failures on edge cases.
217-227: Resolve unusedtokenizerparameter insetup_nemo_gym_config.Static analysis correctly flags
tokenizeras unused. If the API no longer needs it, you can drop the parameter (and update callsites). If you need to keep the signature for compatibility, make the unused nature explicit to keep linters quiet:-def setup_nemo_gym_config(config, tokenizer) -> None: - generation_config = config["policy"]["generation"] +def setup_nemo_gym_config(config, tokenizer) -> None: + # `tokenizer` is currently unused but kept for API compatibility with previous Penguin setup. + del tokenizer + generation_config = config["policy"]["generation"]Alternatively, rename to
_tokenizerand update callsites accordingly.
🧹 Nitpick comments (6)
nemo_rl/experience/rollouts.py (1)
1076-1076: Consider adding explicitstrict=parameter tozip().Per Python 3.10+ best practices and the static analysis hint, consider adding
strict=Trueto ensure both iterables have the same length.Apply this diff:
- for nemo_gym_row, result in zip(nemo_gym_rows, results): + for nemo_gym_row, result in zip(nemo_gym_rows, results, strict=True):3rdparty/NeMo-Gym-workspace/setup.py (1)
47-54: Pre-existing logic issue: existence check after file open.The
pyproject_toml_path.exists()check on line 51 occurs after the file is already opened on line 49. If the file doesn't exist, the code will fail at line 49 with aFileNotFoundErrorbefore reaching the explicit check. This is a pre-existing issue, not introduced by this PR.Consider reordering to check existence before opening:
if src_dir.exists(): pyproject_toml_path = src_dir / "pyproject.toml" + if not pyproject_toml_path.exists(): + raise FileNotFoundError( + f"[NeMo-Gym][setup] {pyproject_toml_path} not found; skipping dependency consistency check." + ) with pyproject_toml_path.open("rb") as f: pyproject_toml = tomllib.load(f) - if not pyproject_toml_path.exists(): - raise FileNotFoundError( - f"[NeMo-Gym][setup] {pyproject_toml_path} not found; skipping dependency consistency check." - )nemo_rl/environments/nemo_gym.py (4)
27-31: AlignNemoGymConfigwith actual usage ofinitial_global_config_dict.
initial_global_config_dictis declared as a required key but is accessed with.get(... ) or dict(), effectively treating it as optional. To keep typing and behavior consistent, and to follow the config-typing guidance, consider marking it as optional viaNotRequired:-from typing import Any, Dict, List, TypedDict +from typing import Any, Dict, List, NotRequired, TypedDict class NemoGymConfig(TypedDict): model_name: str base_urls: List[str] - initial_global_config_dict: Dict[str, Any] + initial_global_config_dict: NotRequired[Dict[str, Any]]You can then keep the existing
.get(... )usage but document the intended default in the relevant YAML config.
48-50: MakeRELATIVE_PATH/dotenv_pathconstruction more portable.The
RELATIVE_PATHstring and__file__.removesuffix(RELATIVE_PATH)approach assume POSIX-style separators and an exact trailing substring match. This will assert on Windows-style paths and is a bit brittle.Consider switching to
Patharithmetic instead, e.g.:- RELATIVE_PATH = "nemo_rl/environments/nemo_gym.py" - assert __file__.endswith(RELATIVE_PATH) + RELATIVE_PATH = Path("nemo_rl") / "environments" / "nemo_gym.py" + assert Path(__file__).as_posix().endswith(RELATIVE_PATH.as_posix()) ... - dotenv_path=Path(__file__.removesuffix(RELATIVE_PATH)).absolute() - / "nemo_gym_env.yaml", + dotenv_path=Path(__file__).resolve().parents[2] / "nemo_gym_env.yaml",(or an equivalent formulation based on your layout) to avoid string-based path surgery.
Also applies to: 88-91
140-193: Keepseen_token_idstrulyList[int]instead of extending with tensors.
seen_token_idsis annotated asList[int]but is extended withtorch.Tensorvalues:seen_token_ids.extend(nemo_rl_message_log[-2]["token_ids"]) seen_token_ids.extend(nemo_rl_message_log[-1]["token_ids"])This works at runtime but mixes scalar tensors with ints and will confuse type checkers. It also makes the list equality check against
prompt_token_idsrely on Tensor–int comparisons.Consider converting to Python ints:
- seen_token_ids.extend(nemo_rl_message_log[-2]["token_ids"]) - seen_token_ids.extend(nemo_rl_message_log[-1]["token_ids"]) + seen_token_ids.extend( + nemo_rl_message_log[-2]["token_ids"].tolist() + ) + seen_token_ids.extend( + nemo_rl_message_log[-1]["token_ids"].tolist() + )This keeps
seen_token_idsconsistently typed and aligns with the annotation.
203-209:step/global_post_process_and_metricsbeing unimplemented is acceptable but may merit docstrings.Raising
NotImplementedErrorwith comments clarifying that NeMo-Gym handles rollouts entirely is fine given this environment’s special role. If these methods are ever called via the genericEnvironmentInterface, consider adding short Google-style docstrings noting that they are intentionally unsupported forNemoGym.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (18)
.gitmodules(1 hunks)3rdparty/NeMo-Gym-workspace/NeMo-Gym(1 hunks)3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py(1 hunks)3rdparty/NeMo-Gym-workspace/pyproject.toml(1 hunks)3rdparty/NeMo-Gym-workspace/setup.py(5 hunks)examples/nemo_gym/grpo_dapo17k_bytedtsinghua_qwen3_4binstruct_nf.yaml(1 hunks)examples/nemo_gym/run_grpo_nemo_gym.py(7 hunks)examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh(2 hunks)nemo_rl/algorithms/grpo.py(6 hunks)nemo_rl/distributed/ray_actor_environment_registry.py(1 hunks)nemo_rl/distributed/virtual_cluster.py(1 hunks)nemo_rl/environments/nemo_gym.py(8 hunks)nemo_rl/experience/rollouts.py(6 hunks)nemo_rl/models/generation/vllm/vllm_worker_async.py(2 hunks)nemo_rl/utils/logger.py(1 hunks)pyproject.toml(3 hunks)tests/unit/environments/test_nemo_gym.py(5 hunks)tests/unit/experience/test_rollouts.py(3 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.sh
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.sh: Use uv run instead of python to execute scripts
Follow the Google Shell Style Guide for shell scripts
Files:
examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh
!(**/tests/**|**/test_*.py|**/test_*.sh)
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
Add the NVIDIA copyright header to all Python files and shell scripts (excluding tests). The header should include the current year
Files:
examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.shnemo_rl/utils/logger.pynemo_rl/distributed/virtual_cluster.pynemo_rl/distributed/ray_actor_environment_registry.py3rdparty/NeMo-Gym-workspace/NeMo-Gymexamples/nemo_gym/grpo_dapo17k_bytedtsinghua_qwen3_4binstruct_nf.yaml3rdparty/NeMo-Gym-workspace/pyproject.toml.gitmodules3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py3rdparty/NeMo-Gym-workspace/setup.pynemo_rl/models/generation/vllm/vllm_worker_async.pynemo_rl/experience/rollouts.pynemo_rl/algorithms/grpo.pyexamples/nemo_gym/run_grpo_nemo_gym.pynemo_rl/environments/nemo_gym.pytests/unit/experience/test_rollouts.pypyproject.tomltests/unit/environments/test_nemo_gym.py
**/*.{py,sh}
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
The NVIDIA copyright header should appear at the top of all Python files and shell scripts (excluding tests)
Files:
examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.shnemo_rl/utils/logger.pynemo_rl/distributed/virtual_cluster.pynemo_rl/distributed/ray_actor_environment_registry.py3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py3rdparty/NeMo-Gym-workspace/setup.pynemo_rl/models/generation/vllm/vllm_worker_async.pynemo_rl/experience/rollouts.pynemo_rl/algorithms/grpo.pyexamples/nemo_gym/run_grpo_nemo_gym.pynemo_rl/environments/nemo_gym.pytests/unit/experience/test_rollouts.pytests/unit/environments/test_nemo_gym.py
**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.py: Conform code to Python 3.12+
Indent code with 4 spaces. Do not use tabs
Use snake_case for file names
Use PascalCase for class names
Use snake_case for function and method names
Use snake_case for local variables
Prefix variable names that start with a number with 'k' (e.g., k_99th_percentile)
Use upper snake_case with 'G' prefix for global variables (e.g., G_MY_GLOBAL)
Use upper snake_case for constants
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
Prefer docstrings over comments for interfaces that may be used outside a file
Reserve comments for code within a function or interfaces that are local to a file
If a piece of code is commented out, include a comment describing its usage and why it's commented out. Remove debug comments before merging
Use Google style docstrings for classes and functions in Python, which can be parsed by Sphinx
Avoid using reflection when functionality can be easily achieved without reflection
When using try-except blocks, limit the except clause to the smallest set of specific errors possible
When using try-except blocks for duck-typing, keep the body of the try as small as possible and use the else block for logic
YAML is the single source of truth for configuration defaults. Do not set non-None defaults in code for configuration values
For required configuration attributes, access config directly and expect presence (e.g., policy_cfg['precision']) without hidden defaults
Use typing.NotRequired to mark optional attributes in TypedDict for configuration
When adding a new config key to a TypedDict subclass, document the key's purpose, valid values/types, and recommended default, and reflect the default in exemplar YAMLs under examples/configs/*.yaml
Follow the Google Python Style Guide for Python code
Files:
nemo_rl/utils/logger.pynemo_rl/distributed/virtual_cluster.pynemo_rl/distributed/ray_actor_environment_registry.py3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py3rdparty/NeMo-Gym-workspace/setup.pynemo_rl/models/generation/vllm/vllm_worker_async.pynemo_rl/experience/rollouts.pynemo_rl/algorithms/grpo.pyexamples/nemo_gym/run_grpo_nemo_gym.pynemo_rl/environments/nemo_gym.pytests/unit/experience/test_rollouts.pytests/unit/environments/test_nemo_gym.py
nemo_rl/**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
For any source file under nemo_rl/*.py that defines a class or function decorated with @ray.remote, add a coverage pragma (# pragma: no cover) because these run in separate Ray processes
Files:
nemo_rl/utils/logger.pynemo_rl/distributed/virtual_cluster.pynemo_rl/distributed/ray_actor_environment_registry.pynemo_rl/models/generation/vllm/vllm_worker_async.pynemo_rl/experience/rollouts.pynemo_rl/algorithms/grpo.pynemo_rl/environments/nemo_gym.py
🧠 Learnings (2)
📚 Learning: 2025-10-12T14:46:55.513Z
Learnt from: zpqiu
Repo: NVIDIA-NeMo/RL PR: 1324
File: tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh:16-30
Timestamp: 2025-10-12T14:46:55.513Z
Learning: In the NVIDIA-NeMo/RL repository, test scripts under tests/ follow a consistent pattern: use `cd $PROJECT_ROOT` without quotes or error handling, and pass arguments with `$@` unquoted. Maintain this consistency when adding new test scripts.
Applied to files:
examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh
📚 Learning: 2025-09-10T05:34:35.406Z
Learnt from: bxyu-nvidia
Repo: NVIDIA-NeMo/RL PR: 1110
File: nemo_rl/models/generation/vllm/vllm_worker_async.py:346-359
Timestamp: 2025-09-10T05:34:35.406Z
Learning: In nemo_rl/models/generation/vllm/vllm_worker_async.py, the HTTP server intentionally uses different path structures: `/v1/chat/completions` is under the `/v1` prefix while `/tokenize` is at the root level without the `/v1` prefix. This is the intended design.
Applied to files:
nemo_rl/models/generation/vllm/vllm_worker_async.py
🧬 Code graph analysis (6)
examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh (1)
tests/unit/environments/test_nemo_gym.py (1)
nemo_gym(82-132)
nemo_rl/distributed/ray_actor_environment_registry.py (1)
nemo_rl/distributed/virtual_cluster.py (1)
PY_EXECUTABLES(43-59)
nemo_rl/experience/rollouts.py (1)
nemo_rl/environments/nemo_gym.py (1)
run_rollouts(106-138)
nemo_rl/environments/nemo_gym.py (3)
nemo_rl/environments/interfaces.py (1)
EnvironmentInterface(52-88)nemo_rl/distributed/virtual_cluster.py (3)
_get_node_ip_local(67-71)_get_free_port_local(74-82)shutdown(477-496)nemo_rl/data/interfaces.py (1)
DatumSpec(32-40)
tests/unit/experience/test_rollouts.py (3)
nemo_rl/environments/nemo_gym.py (1)
nemo_gym_example_to_nemo_rl_datum_spec(235-250)nemo_rl/experience/rollouts.py (2)
run_async_multi_turn_rollout(780-935)run_async_nemo_gym_rollout(958-1145)nemo_rl/data/interfaces.py (1)
DatumSpec(32-40)
tests/unit/environments/test_nemo_gym.py (2)
nemo_rl/environments/nemo_gym.py (4)
NemoGym(34-209)NemoGymConfig(27-30)setup_nemo_gym_config(217-226)run_rollouts(106-138)nemo_rl/distributed/ray_actor_environment_registry.py (1)
get_actor_python_env(49-64)
🪛 Ruff (0.14.7)
3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py
15-15: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
18-18: Do not catch blind exception: Exception
(BLE001)
nemo_rl/experience/rollouts.py
1076-1076: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
nemo_rl/environments/nemo_gym.py
217-217: Unused function argument: tokenizer
(ARG001)
tests/unit/experience/test_rollouts.py
47-47: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
48-48: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
49-49: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
50-50: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
51-51: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
tests/unit/environments/test_nemo_gym.py
35-35: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
39-39: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Lint check
- GitHub Check: Post automodel integration comment / Comment on PR
🔇 Additional comments (38)
nemo_rl/models/generation/vllm/vllm_worker_async.py (3)
53-53: ✓ Documentation update is appropriate.The docstring reference to "Penguin" has been correctly updated to "NeMo-Gym" to reflect the integration rename. The contextual meaning remains clear and accurate.
295-295: ✓ Comment update is consistent with the rename.The comment describing NeMo-Gym specific processing has been properly updated. The clarification about token information sourcing aligns with the documented intent.
126-128: ✓ Ray remote pragma correctly applied.The
@ray.remotedecorator onVllmAsyncGenerationWorkerhas the required# pragma: no coverannotation, consistent with coding guidelines for separate Ray processes.nemo_rl/utils/logger.py (1)
134-134: LGTM! Documentation updated to reflect the rename.The comment has been appropriately updated from "Penguin" to "NeMo-Gym", aligning with the broader repository-wide rename. The comment accurately describes the handling of non-compatible metrics.
.gitmodules (1)
16-20: The NeMo-Gym submodule URL is correct and the repository is accessible.The submodule configuration for the NeMo-Gym repository is valid and properly configured with the main branch and shallow clone enabled.
3rdparty/NeMo-Gym-workspace/NeMo-Gym (1)
1-1: Submodule reference is properly configured.The NeMo-Gym submodule is correctly registered in
.gitmoduleswith the appropriate path, URL, branch, and shallow flag. No NVIDIA copyright header is required for git submodule pointer files, as the coding guidelines apply only to Python files and shell scripts.pyproject.toml (1)
101-101: LGTM! Consistent rename from penguin to nemo_gym.The optional dependency group, uv source, and workspace member path have all been correctly updated to reflect the new NeMo-Gym naming.
Also applies to: 152-152, 175-175
3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py (1)
15-15: LGTM! Import and diagnostic message correctly updated.The import path and diagnostic string have been properly renamed from Penguin to NeMo-Gym.
Note: The static analysis hints flagging the
noqadirective and broad exception catch are false positives. Thenoqa: F401is necessary for this import-only check, and catching all exceptions is appropriate here to detect any import failure.Also applies to: 21-21
examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh (1)
4-4: LGTM! Consistent rename from penguin to nemo_gym.The uv sync extra, test module, and test function names have all been correctly updated to reflect the NeMo-Gym naming.
Also applies to: 30-30, 33-33
nemo_rl/distributed/ray_actor_environment_registry.py (1)
45-45: LGTM! Registry mapping correctly updated.The actor environment registry entry has been properly updated from the Penguin path to NemoGym, mapping to the correct PY_EXECUTABLES.NEMO_GYM executable.
examples/nemo_gym/grpo_dapo17k_bytedtsinghua_qwen3_4binstruct_nf.yaml (1)
233-234: LGTM! Configuration correctly updated for NeMo-Gym.The data file paths, feature flags, and config section have all been consistently renamed from Penguin to NeMo-Gym terminology.
Also applies to: 239-241
nemo_rl/distributed/virtual_cluster.py (1)
58-59: LGTM! Python executable constant correctly updated.The NEMO_GYM executable reference has been properly renamed and updated to use the correct extra package flag.
3rdparty/NeMo-Gym-workspace/pyproject.toml (1)
6-6: LGTM! Package metadata correctly updated.The project name and description have been properly updated to reflect the NeMo-Gym naming.
Also applies to: 9-9
nemo_rl/experience/rollouts.py (2)
939-939: LGTM! Public API correctly renamed from Penguin to NeMo-Gym.The dataclass, function name, return type, and docstring have all been consistently updated to reflect the NeMo-Gym terminology.
Also applies to: 958-967
970-1010: LGTM! Internal variables and assertions correctly updated.All internal variable names (nemo_gym_rows, nemo_gym_environment) and assertion messages have been consistently renamed to use NeMo-Gym terminology.
3rdparty/NeMo-Gym-workspace/setup.py (3)
23-24: LGTM - Consistent rename to NeMo-Gym.The comment and source directory path are correctly updated to reference NeMo-Gym instead of Penguin.
44-44: New dependency added:psutil.Verify that
psutilis also present in the NeMo-Gym submodule'spyproject.tomlto maintain dependency consistency. The existing consistency check (lines 68-94) should catch any mismatch, but confirm this addition is intentional.
102-111: LGTM - Setup metadata correctly updated.Package name, description, and py_modules are consistently renamed to
nemo_gymandis_nemo_gym_installed.nemo_rl/algorithms/grpo.py (4)
57-61: LGTM - Import renamed correctly.The import is consistently renamed from
run_async_penguin_rollouttorun_async_nemo_gym_rollout.
1082-1097: LGTM - Training rollout block updated consistently.Variable names and function call are correctly renamed to NeMo-Gym equivalents. The result extraction (
input_ids,final_batch,rollout_metrics) is preserved.
1586-1600: LGTM - Validation rollout block updated consistently.Same pattern as training - function call and variable names correctly renamed.
851-871: LGTM - Helper function renamed with updated config key.The function
_should_use_nemo_gymis correctly implemented with:
- Updated docstring referencing NeMo-Gym
- Config key properly accessing
should_use_nemo_gymfrom the configuration- Local variable names using snake_case convention
- Clear error messages referencing the NeMo-Gym requirement
- YAML configs updated to use the new key (verified in
examples/nemo_gym/grpo_dapo17k_bytedtsinghua_qwen3_4binstruct_nf.yaml)- Proper copyright header and style compliance
tests/unit/experience/test_rollouts.py (3)
35-40: LGTM - Imports correctly renamed.The imports are consistently updated to NeMo-Gym equivalents.
45-52: LGTM - Fixture imports renamed correctly.The fixture imports are correctly renamed to
nemo_gymvariants. The# noqa: F401directives are appropriate since these are pytest fixtures imported for dependency injection into test functions.Note: Static analysis hints flag these as "unused noqa directives" because the F401 rule appears to not be enabled in this project's Ruff configuration. The directives are harmless but could be removed if the team prefers cleaner code.
750-773: LGTM - Test function renamed with consistent fixture usage.The test function and all related fixtures/variables are correctly renamed to NeMo-Gym equivalents:
NEMO_GYM_INSTALLEDskip conditiontest_run_async_nemo_gym_rolloutfunction name- All fixture parameters renamed appropriately
- Internal variable names updated consistently
examples/nemo_gym/run_grpo_nemo_gym.py (4)
39-57: LGTM - Imports correctly updated to NeMo-Gym.All imports are consistently renamed to use NeMo-Gym equivalents from the updated module paths.
78-109: LGTM - Dataset setup function renamed with consistent variable names.Function
setup_single_nemo_gym_datasetand all internal variables (nemo_gym_examples,nemo_gym_example) are consistently renamed.
132-157: LGTM - Trajectory collection updated to use NeMo-Gym rollout.The
run_async_nemo_gym_rolloutcall and result variablenemo_gym_rollout_resultare correctly renamed.
198-267: LGTM - Main function configuration and environment setup updated.All NeMo-Gym related configuration and environment setup is correctly updated:
setup_nemo_gym_configcall (line 199)_should_use_nemo_gymassertion (line 202)- Config key access:
config["env"]["nemo_gym"](lines 250, 255)NemoGymConfigandNemoGymactor instantiation- Task environment mapping:
{"nemo_gym": nemo_gym}(line 266)tests/unit/environments/test_nemo_gym.py (8)
26-26: LGTM - Imports updated to NeMo-Gym equivalents.Correctly imports from the renamed module
nemo_rl.environments.nemo_gym.
35-35: LGTM - Tokenizer fixture correctly aliased.The tokenizer fixture is aliased as
nemo_gym_tokenizerfor clarity in this test module.
38-44: LGTM - Installation check updated to NeMo-Gym.The try/except block correctly checks for
nemo_gympackage installation and sets theNEMO_GYM_INSTALLEDflag.
47-54: LGTM - Stub module test renamed.Test function correctly renamed to
test_nemo_gym_stub_modulewith updated print message.
57-78: LGTM - vLLM generation fixture renamed.Fixture
nemo_gym_vllm_generationcorrectly usessetup_nemo_gym_configand dependency onnemo_gym_tokenizer.
81-132: LGTM - NeMo-Gym actor fixture correctly configured.The fixture properly creates a
NemoGymactor with:
NemoGymConfigconfiguration- Correct runtime environment path
nemo_rl.environments.nemo_gym.NemoGym- Health check before yielding
- Proper cleanup with shutdown and ray.kill
143-200: LGTM - Sanity test function renamed with consistent parameters.Test function
test_nemo_gym_sanitycorrectly uses renamed fixtures and variables throughout. The test logic remains functionally equivalent.
135-140: No action required—test data file exists at the specified path.The file
test_nemo_gym_sanity.jsonis present attests/unit/environments/nemo_gym_test_data/and the path construction in the fixture is correct. No old test data files remain.nemo_rl/environments/nemo_gym.py (1)
234-250:nemo_gym_example_to_nemo_rl_datum_specmapping looks consistent withDatumSpec.The helper correctly wraps the raw
nemo_gym_exampleinextra_env_info, assignsidx, and tagstask_name="nemo_gym", matching the DatumSpec shape fromnemo_rl.data.interfaces. The extratoken_idskey is a pragmatic compatibility shim with current GRPO code.
c3c1ef5 to
c1f272a
Compare
Signed-off-by: Terry Kong <[email protected]>
Signed-off-by: Terry Kong <[email protected]>
Signed-off-by: Terry Kong <[email protected]>
Signed-off-by: Terry Kong <[email protected]>
c1f272a to
694b4b7
Compare
What does this PR do ?
Add a one line overview of what this PR aims to accomplish.
Issues
List issues that this PR closes (syntax):
closes NVIDIA-NeMo/Gym#362
Usage
# Add a code snippet demonstrating how to use thisBefore your PR is "Ready for review"
Pre checks:
Additional Information
Summary by CodeRabbit
New Features
Refactor
Tests
✏️ Tip: You can customize this high-level summary in your review settings.