@terrykong terrykong commented Dec 2, 2025

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Issues

List issues that this PR closes (syntax):

closes NVIDIA-NeMo/Gym#362

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

  • New Features

    • Added NeMo-Gym environment integration for reinforcement learning workflows.
  • Refactor

    • Migrated system configuration and dependencies from Penguin to NeMo-Gym naming conventions.
    • Updated environment settings and parameters to reflect new naming scheme.
  • Tests

    • Updated test suite to support NeMo-Gym integration.


@terrykong terrykong requested review from a team as code owners December 2, 2025 07:48
github-actions bot commented Dec 2, 2025

✅ Submodule Fast-Forward Check Results

Check based on commit: 45eca2f (PR #1587 from tk/gym-rename)

✅ Submodules that are properly updated:

NeMo-Gym: ✅ New submodule being added

All submodule changes look good! ✨

coderabbitai bot commented Dec 2, 2025

Walkthrough

This pull request systematically replaces all references to the "Penguin" integration with "NeMo-Gym" across the codebase, including adding a new NeMo-Gym submodule, updating packaging configuration, environment wrappers, rollout logic, distributed registry entries, training algorithms, examples, and test suites.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Submodule integration: .gitmodules, 3rdparty/NeMo-Gym-workspace/NeMo-Gym | Adds new NeMo-Gym submodule entry pointing to NVIDIA-NeMo/Gym repository with shallow clone enabled. |
| Workspace packaging and setup: 3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py, pyproject.toml, setup.py | Updates workspace project metadata and setup logic from Penguin to NeMo-Gym; adds psutil to dependencies; updates module name and diagnostic strings. |
| Core environment and rollout APIs: nemo_rl/environments/nemo_gym.py, nemo_rl/experience/rollouts.py | Renames public classes and functions: Penguin → NemoGym, PenguinConfig → NemoGymConfig, run_async_penguin_rollout → run_async_nemo_gym_rollout, AsyncPenguinRolloutResult → AsyncNemoGymRolloutResult; updates all internal variable and method names accordingly. |
| Algorithm and training integration: nemo_rl/algorithms/grpo.py | Replaces Penguin rollout path with NeMo-Gym: renames _should_use_penguin → _should_use_nemo_gym, updates import and call sites for rollout functions, updates error messages and assertions. |
| Distributed system registry: nemo_rl/distributed/ray_actor_environment_registry.py, virtual_cluster.py | Updates actor environment registry mapping from nemo_rl.environments.penguin.Penguin to nemo_rl.environments.nemo_gym.NemoGym; replaces PY_EXECUTABLES.PENGUIN with PY_EXECUTABLES.NEMO_GYM. |
| Root configuration and dependencies: pyproject.toml | Renames optional dependency group from penguin to nemo_gym; updates workspace members and uv source references. |
| Examples and configurations: examples/nemo_gym/grpo_dapo17k_bytedtsinghua_qwen3_4binstruct_nf.yaml, run_grpo_nemo_gym.py, run_nemo_gym_single_node_sanity_tests.sh | Updates configuration keys (should_use_penguin → should_use_nemo_gym), imports, function calls, and test invocation names; updates data file paths and environment variable names. |
| Logging and documentation: nemo_rl/models/generation/vllm/vllm_worker_async.py, nemo_rl/utils/logger.py | Updates comments and docstrings to reference NeMo-Gym instead of Penguin; no functional logic changes. |
| Unit and integration tests: tests/unit/environments/test_nemo_gym.py, tests/unit/experience/test_rollouts.py | Renames test fixtures, constants, and functions: PENGUIN_INSTALLED → NEMO_GYM_INSTALLED, test_penguin_sanity → test_nemo_gym_sanity, test_run_async_penguin_rollout → test_run_async_nemo_gym_rollout; updates import paths and test data references. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Areas requiring extra attention:

  • Distributed registry consistency: Verify that ray_actor_environment_registry.py and virtual_cluster.py are correctly synchronized with the renamed class and executable mappings.
  • Public API surface: Review all renamed functions and classes (e.g., run_async_nemo_gym_rollout, NemoGymConfig) to ensure imports and call sites are updated consistently across multiple integration points.
  • Submodule references: Confirm that workspace paths in setup.py and configuration files correctly point to NeMo-Gym-workspace/NeMo-Gym/ structure.
  • Test data and fixtures: Verify that test fixtures (e.g., nemo_gym_sanity_test_data, nemo_gym_tokenizer) correctly resolve and that skip conditions are properly updated.

Possibly related PRs

  • feat: Integrate Penguin env logic #1450: Directly modifies the same environment/rollout integration code by renaming Penguin entities to NeMo-Gym counterparts across rollouts, grpo, and environment classes.
  • feat: Add Penguin env #1327: Systematically renames and rewires the Penguin integration by replacing the same modules, classes, functions, and registry entries with NeMo-Gym equivalents.
  • feat: Add Penguin run #1481: Shares modifications to environment and rollout codepaths; this PR renames Penguin symbols to NeMo-Gym while the related PR adds Penguin run functionality to the same modules.

Suggested labels

documentation, CI:L1

Suggested reviewers

  • parthchadha
  • guyueh1

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 44.83% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
| Test Results For Major Changes | ⚠️ Warning | PR introduces major breaking API changes (class/function/import renames) across 18+ files but provides no test execution results or verification in the description. | Add test execution results, sanity test outputs, and regression verification to confirm all tests pass and no functionality regressions occurred. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The PR title clearly and accurately summarizes the main changes: renaming penguin to nemo_gym throughout the codebase and adding the gym submodule, which aligns with the comprehensive file-level changes documented in the summary. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
nemo_rl/environments/nemo_gym.py (2)

106-139: Fix run_rollouts return annotation and guard timing metrics for empty batches.

run_rollouts is annotated as returning list[dict], but it actually returns (nemo_rl_results, timing_metrics). This will confuse type checkers and anyone reading the API, and may mask call-site mistakes.

At the same time, computing timing_metrics[f"{timer_prefix}/postprocess_results_pct"] will raise KeyError if no rollouts were processed (empty iterator) and will divide by zero if _run_rollouts_total ends up as 0.

Suggested change:

-    async def run_rollouts(
+    async def run_rollouts(
         self,
         nemo_gym_examples: list[dict],
         tokenizer: PreTrainedTokenizerBase,
         timer_prefix: str,
-    ) -> list[dict]:
+    ) -> tuple[list[dict], dict[str, float]]:
@@
-        nemo_rl_results = []
+        nemo_rl_results: list[dict] = []
@@
-        timing_metrics = timer.get_timing_metrics("sum")
-        total_time = timing_metrics.pop("_run_rollouts_total")
-        timing_metrics[f"{timer_prefix}/postprocess_results_pct"] = (
-            100 * timing_metrics[f"{timer_prefix}/postprocess_results"] / total_time
-        )
+        timing_metrics = timer.get_timing_metrics("sum")
+        total_time = timing_metrics.pop("_run_rollouts_total")
+        postprocess_key = f"{timer_prefix}/postprocess_results"
+        if postprocess_key in timing_metrics and total_time > 0:
+            timing_metrics[f"{timer_prefix}/postprocess_results_pct"] = (
+                100 * timing_metrics[postprocess_key] / total_time
+            )

This matches the actual return shape and avoids surprising failures on edge cases.
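
The guard described above can be sketched in isolation, using a plain dict in place of the real timer output. The helper name `add_postprocess_pct`, the `"rollout"` prefix, and the sample values are hypothetical; only the metric key names come from the diff. Unlike the diff, this sketch also gives `pop` a default so a missing total cannot raise.

```python
def add_postprocess_pct(
    timing_metrics: dict[str, float], timer_prefix: str
) -> dict[str, float]:
    """Return a copy of timing_metrics with a guarded postprocess percentage.

    Hypothetical helper for illustration; it is not part of nemo_rl.
    """
    metrics = dict(timing_metrics)
    # Assumption: default to 0.0 so a missing total cannot raise KeyError.
    total_time = metrics.pop("_run_rollouts_total", 0.0)
    postprocess_key = f"{timer_prefix}/postprocess_results"
    # Only add the percentage when both inputs are usable.
    if postprocess_key in metrics and total_time > 0:
        metrics[f"{timer_prefix}/postprocess_results_pct"] = (
            100 * metrics[postprocess_key] / total_time
        )
    return metrics


# A normal batch: 1 second of postprocessing out of 4 seconds total.
normal = add_postprocess_pct(
    {"_run_rollouts_total": 4.0, "rollout/postprocess_results": 1.0}, "rollout"
)
# An empty batch: no postprocess entry and a zero total.
empty = add_postprocess_pct({"_run_rollouts_total": 0.0}, "rollout")
```

In the empty case the percentage key is simply absent, so downstream consumers would need to tolerate a missing metric rather than a crash.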


217-227: Resolve unused tokenizer parameter in setup_nemo_gym_config.

Static analysis correctly flags tokenizer as unused. If the API no longer needs it, you can drop the parameter (and update callsites). If you need to keep the signature for compatibility, make the unused nature explicit to keep linters quiet:

-def setup_nemo_gym_config(config, tokenizer) -> None:
-    generation_config = config["policy"]["generation"]
+def setup_nemo_gym_config(config, tokenizer) -> None:
+    # `tokenizer` is currently unused but kept for API compatibility with previous Penguin setup.
+    del tokenizer
+    generation_config = config["policy"]["generation"]

Alternatively, rename to _tokenizer and update callsites accordingly.

🧹 Nitpick comments (6)
nemo_rl/experience/rollouts.py (1)

1076-1076: Consider adding explicit strict= parameter to zip().

Per Python 3.10+ best practices and the static analysis hint, consider adding strict=True to ensure both iterables have the same length.

Apply this diff:

-        for nemo_gym_row, result in zip(nemo_gym_rows, results):
+        for nemo_gym_row, result in zip(nemo_gym_rows, results, strict=True):
3rdparty/NeMo-Gym-workspace/setup.py (1)

47-54: Pre-existing logic issue: existence check after file open.

The pyproject_toml_path.exists() check on line 51 occurs after the file is already opened on line 49. If the file doesn't exist, the code will fail at line 49 with a FileNotFoundError before reaching the explicit check. This is a pre-existing issue, not introduced by this PR.

Consider reordering to check existence before opening:

 if src_dir.exists():
     pyproject_toml_path = src_dir / "pyproject.toml"
+    if not pyproject_toml_path.exists():
+        raise FileNotFoundError(
+            f"[NeMo-Gym][setup] {pyproject_toml_path} not found; skipping dependency consistency check."
+        )
     with pyproject_toml_path.open("rb") as f:
         pyproject_toml = tomllib.load(f)
-    if not pyproject_toml_path.exists():
-        raise FileNotFoundError(
-            f"[NeMo-Gym][setup] {pyproject_toml_path} not found; skipping dependency consistency check."
-        )
nemo_rl/environments/nemo_gym.py (4)

27-31: Align NemoGymConfig with actual usage of initial_global_config_dict.

initial_global_config_dict is declared as a required key but is accessed with .get(...) or dict(), effectively treating it as optional. To keep typing and behavior consistent, and to follow the config-typing guidance, consider marking it as optional via NotRequired:

-from typing import Any, Dict, List, TypedDict
+from typing import Any, Dict, List, NotRequired, TypedDict

 class NemoGymConfig(TypedDict):
     model_name: str
     base_urls: List[str]
-    initial_global_config_dict: Dict[str, Any]
+    initial_global_config_dict: NotRequired[Dict[str, Any]]

You can then keep the existing .get(...) usage but document the intended default in the relevant YAML config.


48-50: Make RELATIVE_PATH / dotenv_path construction more portable.

The RELATIVE_PATH string and __file__.removesuffix(RELATIVE_PATH) approach assume POSIX-style separators and an exact trailing substring match. The assertion will fail on Windows-style paths, and the string matching is brittle.

Consider switching to Path arithmetic instead, e.g.:

-        RELATIVE_PATH = "nemo_rl/environments/nemo_gym.py"
-        assert __file__.endswith(RELATIVE_PATH)
+        RELATIVE_PATH = Path("nemo_rl") / "environments" / "nemo_gym.py"
+        assert Path(__file__).as_posix().endswith(RELATIVE_PATH.as_posix())
...
-                dotenv_path=Path(__file__.removesuffix(RELATIVE_PATH)).absolute()
-                / "nemo_gym_env.yaml",
+                dotenv_path=Path(__file__).resolve().parents[2] / "nemo_gym_env.yaml",

(or an equivalent formulation based on your layout) to avoid string-based path surgery.

Also applies to: 88-91
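
The `parents[2]` arithmetic can be checked with a small sketch. The absolute path below is hypothetical and replaces `__file__` so the example runs anywhere; `PurePosixPath` is used only to keep the result identical across platforms.

```python
from pathlib import PurePosixPath

# Hypothetical location of the module; real code would start from Path(__file__).
module_file = PurePosixPath("/workspace/repo/nemo_rl/environments/nemo_gym.py")

# parents[0] is .../environments, parents[1] is .../nemo_rl,
# parents[2] is the repository root.
repo_root = module_file.parents[2]
dotenv_path = repo_root / "nemo_gym_env.yaml"

# Comparing trailing components avoids separator-dependent string matching.
is_expected_module = module_file.parts[-3:] == (
    "nemo_rl",
    "environments",
    "nemo_gym.py",
)
```

The component comparison is one possible replacement for the `endswith` assertion; it holds regardless of which separator the platform uses.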


140-193: Keep seen_token_ids truly List[int] instead of extending with tensors.

seen_token_ids is annotated as List[int] but is extended with torch.Tensor values:

seen_token_ids.extend(nemo_rl_message_log[-2]["token_ids"])
seen_token_ids.extend(nemo_rl_message_log[-1]["token_ids"])

This works at runtime but mixes scalar tensors with ints and will confuse type checkers. It also makes the list equality check against prompt_token_ids rely on Tensor–int comparisons.

Consider converting to Python ints:

-            seen_token_ids.extend(nemo_rl_message_log[-2]["token_ids"])
-            seen_token_ids.extend(nemo_rl_message_log[-1]["token_ids"])
+            seen_token_ids.extend(
+                nemo_rl_message_log[-2]["token_ids"].tolist()
+            )
+            seen_token_ids.extend(
+                nemo_rl_message_log[-1]["token_ids"].tolist()
+            )

This keeps seen_token_ids consistently typed and aligns with the annotation.
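
The effect of `.tolist()` can be sketched without a torch install. Here `array.array` stands in for a 1-D integer tensor, which is an assumption made purely so the example is self-contained; the helper name `extend_with_ints` and the token values are hypothetical.

```python
from array import array
from typing import Any


def extend_with_ints(seen_token_ids: list[int], token_ids: Any) -> None:
    """Append token IDs as plain Python ints.

    Assumes token_ids exposes .tolist(), as torch.Tensor does.
    """
    seen_token_ids.extend(token_ids.tolist())


seen_token_ids: list[int] = []
# array("q") is a stand-in for an integer tensor of token IDs.
extend_with_ints(seen_token_ids, array("q", [101, 7592]))
extend_with_ints(seen_token_ids, array("q", [2088, 102]))
```

After both calls the list holds only built-in ints, so an equality check against another `list[int]` is an ordinary list comparison.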


203-209: step / global_post_process_and_metrics being unimplemented is acceptable but may merit docstrings.

Raising NotImplementedError with comments clarifying that NeMo-Gym handles rollouts entirely is fine given this environment’s special role. If these methods are ever called via the generic EnvironmentInterface, consider adding short Google-style docstrings noting that they are intentionally unsupported for NemoGym.
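
One way such a docstring might look, as a sketch only. The class `RolloutOnlyEnvironment` and the `step` signature are hypothetical; the real `EnvironmentInterface` signature is not shown in this review.

```python
class RolloutOnlyEnvironment:
    """Hypothetical environment whose rollouts run entirely in an external service."""

    def step(self, message_log_batch: list, metadata: list) -> None:
        """Intentionally unsupported for this environment.

        Args:
            message_log_batch: Ignored; present only to match the interface.
            metadata: Ignored; present only to match the interface.

        Raises:
            NotImplementedError: Always, because rollouts are handled externally.
        """
        raise NotImplementedError(
            "step() is intentionally unsupported; use the rollout entry point."
        )


env = RolloutOnlyEnvironment()
try:
    env.step([], [])
    step_raised = False
except NotImplementedError:
    step_raised = True
```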

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3817189 and 45eca2f.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
  • .gitmodules (1 hunks)
  • 3rdparty/NeMo-Gym-workspace/NeMo-Gym (1 hunks)
  • 3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py (1 hunks)
  • 3rdparty/NeMo-Gym-workspace/pyproject.toml (1 hunks)
  • 3rdparty/NeMo-Gym-workspace/setup.py (5 hunks)
  • examples/nemo_gym/grpo_dapo17k_bytedtsinghua_qwen3_4binstruct_nf.yaml (1 hunks)
  • examples/nemo_gym/run_grpo_nemo_gym.py (7 hunks)
  • examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh (2 hunks)
  • nemo_rl/algorithms/grpo.py (6 hunks)
  • nemo_rl/distributed/ray_actor_environment_registry.py (1 hunks)
  • nemo_rl/distributed/virtual_cluster.py (1 hunks)
  • nemo_rl/environments/nemo_gym.py (8 hunks)
  • nemo_rl/experience/rollouts.py (6 hunks)
  • nemo_rl/models/generation/vllm/vllm_worker_async.py (2 hunks)
  • nemo_rl/utils/logger.py (1 hunks)
  • pyproject.toml (3 hunks)
  • tests/unit/environments/test_nemo_gym.py (5 hunks)
  • tests/unit/experience/test_rollouts.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.sh: Use uv run instead of python to execute scripts
Follow the Google Shell Style Guide for shell scripts

Files:

  • examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh
!(**/tests/**|**/test_*.py|**/test_*.sh)

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Add the NVIDIA copyright header to all Python files and shell scripts (excluding tests). The header should include the current year

Files:

  • examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh
  • nemo_rl/utils/logger.py
  • nemo_rl/distributed/virtual_cluster.py
  • nemo_rl/distributed/ray_actor_environment_registry.py
  • 3rdparty/NeMo-Gym-workspace/NeMo-Gym
  • examples/nemo_gym/grpo_dapo17k_bytedtsinghua_qwen3_4binstruct_nf.yaml
  • 3rdparty/NeMo-Gym-workspace/pyproject.toml
  • .gitmodules
  • 3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py
  • 3rdparty/NeMo-Gym-workspace/setup.py
  • nemo_rl/models/generation/vllm/vllm_worker_async.py
  • nemo_rl/experience/rollouts.py
  • nemo_rl/algorithms/grpo.py
  • examples/nemo_gym/run_grpo_nemo_gym.py
  • nemo_rl/environments/nemo_gym.py
  • tests/unit/experience/test_rollouts.py
  • pyproject.toml
  • tests/unit/environments/test_nemo_gym.py
**/*.{py,sh}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

The NVIDIA copyright header should appear at the top of all Python files and shell scripts (excluding tests)

Files:

  • examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh
  • nemo_rl/utils/logger.py
  • nemo_rl/distributed/virtual_cluster.py
  • nemo_rl/distributed/ray_actor_environment_registry.py
  • 3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py
  • 3rdparty/NeMo-Gym-workspace/setup.py
  • nemo_rl/models/generation/vllm/vllm_worker_async.py
  • nemo_rl/experience/rollouts.py
  • nemo_rl/algorithms/grpo.py
  • examples/nemo_gym/run_grpo_nemo_gym.py
  • nemo_rl/environments/nemo_gym.py
  • tests/unit/experience/test_rollouts.py
  • tests/unit/environments/test_nemo_gym.py
**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Conform code to Python 3.12+
Indent code with 4 spaces. Do not use tabs
Use snake_case for file names
Use PascalCase for class names
Use snake_case for function and method names
Use snake_case for local variables
Prefix variable names that start with a number with 'k' (e.g., k_99th_percentile)
Use upper snake_case with 'G' prefix for global variables (e.g., G_MY_GLOBAL)
Use upper snake_case for constants
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
Prefer docstrings over comments for interfaces that may be used outside a file
Reserve comments for code within a function or interfaces that are local to a file
If a piece of code is commented out, include a comment describing its usage and why it's commented out. Remove debug comments before merging
Use Google style docstrings for classes and functions in Python, which can be parsed by Sphinx
Avoid using reflection when functionality can be easily achieved without reflection
When using try-except blocks, limit the except clause to the smallest set of specific errors possible
When using try-except blocks for duck-typing, keep the body of the try as small as possible and use the else block for logic
YAML is the single source of truth for configuration defaults. Do not set non-None defaults in code for configuration values
For required configuration attributes, access config directly and expect presence (e.g., policy_cfg['precision']) without hidden defaults
Use typing.NotRequired to mark optional attributes in TypedDict for configuration
When adding a new config key to a TypedDict subclass, document the key's purpose, valid values/types, and recommended default, and reflect the default in exemplar YAMLs under examples/configs/*.yaml
Follow the Google Python Style Guide for Python code

Files:

  • nemo_rl/utils/logger.py
  • nemo_rl/distributed/virtual_cluster.py
  • nemo_rl/distributed/ray_actor_environment_registry.py
  • 3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py
  • 3rdparty/NeMo-Gym-workspace/setup.py
  • nemo_rl/models/generation/vllm/vllm_worker_async.py
  • nemo_rl/experience/rollouts.py
  • nemo_rl/algorithms/grpo.py
  • examples/nemo_gym/run_grpo_nemo_gym.py
  • nemo_rl/environments/nemo_gym.py
  • tests/unit/experience/test_rollouts.py
  • tests/unit/environments/test_nemo_gym.py
nemo_rl/**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

For any source file under nemo_rl/*.py that defines a class or function decorated with @ray.remote, add a coverage pragma (# pragma: no cover) because these run in separate Ray processes

Files:

  • nemo_rl/utils/logger.py
  • nemo_rl/distributed/virtual_cluster.py
  • nemo_rl/distributed/ray_actor_environment_registry.py
  • nemo_rl/models/generation/vllm/vllm_worker_async.py
  • nemo_rl/experience/rollouts.py
  • nemo_rl/algorithms/grpo.py
  • nemo_rl/environments/nemo_gym.py
🧠 Learnings (2)
📚 Learning: 2025-10-12T14:46:55.513Z
Learnt from: zpqiu
Repo: NVIDIA-NeMo/RL PR: 1324
File: tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh:16-30
Timestamp: 2025-10-12T14:46:55.513Z
Learning: In the NVIDIA-NeMo/RL repository, test scripts under tests/ follow a consistent pattern: use `cd $PROJECT_ROOT` without quotes or error handling, and pass arguments with `$@` unquoted. Maintain this consistency when adding new test scripts.

Applied to files:

  • examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh
📚 Learning: 2025-09-10T05:34:35.406Z
Learnt from: bxyu-nvidia
Repo: NVIDIA-NeMo/RL PR: 1110
File: nemo_rl/models/generation/vllm/vllm_worker_async.py:346-359
Timestamp: 2025-09-10T05:34:35.406Z
Learning: In nemo_rl/models/generation/vllm/vllm_worker_async.py, the HTTP server intentionally uses different path structures: `/v1/chat/completions` is under the `/v1` prefix while `/tokenize` is at the root level without the `/v1` prefix. This is the intended design.

Applied to files:

  • nemo_rl/models/generation/vllm/vllm_worker_async.py
🧬 Code graph analysis (6)
examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh (1)
tests/unit/environments/test_nemo_gym.py (1)
  • nemo_gym (82-132)
nemo_rl/distributed/ray_actor_environment_registry.py (1)
nemo_rl/distributed/virtual_cluster.py (1)
  • PY_EXECUTABLES (43-59)
nemo_rl/experience/rollouts.py (1)
nemo_rl/environments/nemo_gym.py (1)
  • run_rollouts (106-138)
nemo_rl/environments/nemo_gym.py (3)
nemo_rl/environments/interfaces.py (1)
  • EnvironmentInterface (52-88)
nemo_rl/distributed/virtual_cluster.py (3)
  • _get_node_ip_local (67-71)
  • _get_free_port_local (74-82)
  • shutdown (477-496)
nemo_rl/data/interfaces.py (1)
  • DatumSpec (32-40)
tests/unit/experience/test_rollouts.py (3)
nemo_rl/environments/nemo_gym.py (1)
  • nemo_gym_example_to_nemo_rl_datum_spec (235-250)
nemo_rl/experience/rollouts.py (2)
  • run_async_multi_turn_rollout (780-935)
  • run_async_nemo_gym_rollout (958-1145)
nemo_rl/data/interfaces.py (1)
  • DatumSpec (32-40)
tests/unit/environments/test_nemo_gym.py (2)
nemo_rl/environments/nemo_gym.py (4)
  • NemoGym (34-209)
  • NemoGymConfig (27-30)
  • setup_nemo_gym_config (217-226)
  • run_rollouts (106-138)
nemo_rl/distributed/ray_actor_environment_registry.py (1)
  • get_actor_python_env (49-64)
🪛 Ruff (0.14.7)
3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py

15-15: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)


18-18: Do not catch blind exception: Exception

(BLE001)

nemo_rl/experience/rollouts.py

1076-1076: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

nemo_rl/environments/nemo_gym.py

217-217: Unused function argument: tokenizer

(ARG001)

tests/unit/experience/test_rollouts.py

47-47: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)


48-48: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)


49-49: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)


50-50: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)


51-51: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)

tests/unit/environments/test_nemo_gym.py

35-35: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)


39-39: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Lint check
  • GitHub Check: Post automodel integration comment / Comment on PR
🔇 Additional comments (38)
nemo_rl/models/generation/vllm/vllm_worker_async.py (3)

53-53: ✓ Documentation update is appropriate.

The docstring reference to "Penguin" has been correctly updated to "NeMo-Gym" to reflect the integration rename. The contextual meaning remains clear and accurate.


295-295: ✓ Comment update is consistent with the rename.

The comment describing NeMo-Gym specific processing has been properly updated. The clarification about token information sourcing aligns with the documented intent.


126-128: ✓ Ray remote pragma correctly applied.

The @ray.remote decorator on VllmAsyncGenerationWorker has the required # pragma: no cover annotation, consistent with coding guidelines for separate Ray processes.

nemo_rl/utils/logger.py (1)

134-134: LGTM! Documentation updated to reflect the rename.

The comment has been appropriately updated from "Penguin" to "NeMo-Gym", aligning with the broader repository-wide rename. The comment accurately describes the handling of non-compatible metrics.

.gitmodules (1)

16-20: The NeMo-Gym submodule URL is correct and the repository is accessible.

The submodule configuration for the NeMo-Gym repository is valid and properly configured with the main branch and shallow clone enabled.

3rdparty/NeMo-Gym-workspace/NeMo-Gym (1)

1-1: Submodule reference is properly configured.

The NeMo-Gym submodule is correctly registered in .gitmodules with the appropriate path, URL, branch, and shallow flag. No NVIDIA copyright header is required for git submodule pointer files, as the coding guidelines apply only to Python files and shell scripts.

pyproject.toml (1)

101-101: LGTM! Consistent rename from penguin to nemo_gym.

The optional dependency group, uv source, and workspace member path have all been correctly updated to reflect the new NeMo-Gym naming.

Also applies to: 152-152, 175-175

3rdparty/NeMo-Gym-workspace/is_nemo_gym_installed.py (1)

15-15: LGTM! Import and diagnostic message correctly updated.

The import path and diagnostic string have been properly renamed from Penguin to NeMo-Gym.

Note: The static analysis hints flagging the noqa directive and broad exception catch are false positives. The noqa: F401 is necessary for this import-only check, and catching all exceptions is appropriate here to detect any import failure.

Also applies to: 21-21

examples/nemo_gym/run_nemo_gym_single_node_sanity_tests.sh (1)

4-4: LGTM! Consistent rename from penguin to nemo_gym.

The uv sync extra, test module, and test function names have all been correctly updated to reflect the NeMo-Gym naming.

Also applies to: 30-30, 33-33

nemo_rl/distributed/ray_actor_environment_registry.py (1)

45-45: LGTM! Registry mapping correctly updated.

The actor environment registry entry has been properly updated from the Penguin path to NemoGym, mapping to the correct PY_EXECUTABLES.NEMO_GYM executable.

examples/nemo_gym/grpo_dapo17k_bytedtsinghua_qwen3_4binstruct_nf.yaml (1)

233-234: LGTM! Configuration correctly updated for NeMo-Gym.

The data file paths, feature flags, and config section have all been consistently renamed from Penguin to NeMo-Gym terminology.

Also applies to: 239-241

nemo_rl/distributed/virtual_cluster.py (1)

58-59: LGTM! Python executable constant correctly updated.

The NEMO_GYM executable reference has been properly renamed and updated to use the correct extra package flag.

3rdparty/NeMo-Gym-workspace/pyproject.toml (1)

6-6: LGTM! Package metadata correctly updated.

The project name and description have been properly updated to reflect the NeMo-Gym naming.

Also applies to: 9-9

nemo_rl/experience/rollouts.py (2)

939-939: LGTM! Public API correctly renamed from Penguin to NeMo-Gym.

The dataclass, function name, return type, and docstring have all been consistently updated to reflect the NeMo-Gym terminology.

Also applies to: 958-967


970-1010: LGTM! Internal variables and assertions correctly updated.

All internal variable names (nemo_gym_rows, nemo_gym_environment) and assertion messages have been consistently renamed to use NeMo-Gym terminology.

3rdparty/NeMo-Gym-workspace/setup.py (3)

23-24: LGTM - Consistent rename to NeMo-Gym.

The comment and source directory path are correctly updated to reference NeMo-Gym instead of Penguin.


44-44: New dependency added: psutil.

Verify that psutil is also present in the NeMo-Gym submodule's pyproject.toml to maintain dependency consistency. The existing consistency check (lines 68-94) should catch any mismatch, but confirm this addition is intentional.


102-111: LGTM - Setup metadata correctly updated.

Package name, description, and py_modules are consistently renamed to nemo_gym and is_nemo_gym_installed.

nemo_rl/algorithms/grpo.py (4)

57-61: LGTM - Import renamed correctly.

The import is consistently renamed from run_async_penguin_rollout to run_async_nemo_gym_rollout.


1082-1097: LGTM - Training rollout block updated consistently.

Variable names and function call are correctly renamed to NeMo-Gym equivalents. The result extraction (input_ids, final_batch, rollout_metrics) is preserved.


1586-1600: LGTM - Validation rollout block updated consistently.

Same pattern as training - function call and variable names correctly renamed.


851-871: LGTM - Helper function renamed with updated config key.

The function _should_use_nemo_gym is correctly implemented with:

  • Updated docstring referencing NeMo-Gym
  • Config key properly accessing should_use_nemo_gym from the configuration
  • Local variable names using snake_case convention
  • Clear error messages referencing the NeMo-Gym requirement
  • YAML configs updated to use the new key (verified in examples/nemo_gym/grpo_dapo17k_bytedtsinghua_qwen3_4binstruct_nf.yaml)
  • Proper copyright header and style compliance
tests/unit/experience/test_rollouts.py (3)

35-40: LGTM - Imports correctly renamed.

The imports are consistently updated to NeMo-Gym equivalents.


45-52: LGTM - Fixture imports renamed correctly.

The fixture imports are correctly renamed to nemo_gym variants. The # noqa: F401 directives are appropriate since these are pytest fixtures imported for dependency injection into test functions.

Note: Static analysis hints flag these as "unused noqa directives" because the F401 rule appears to not be enabled in this project's Ruff configuration. The directives are harmless but could be removed if the team prefers cleaner code.


750-773: LGTM - Test function renamed with consistent fixture usage.

The test function and all related fixtures/variables are correctly renamed to NeMo-Gym equivalents:

  • NEMO_GYM_INSTALLED skip condition
  • test_run_async_nemo_gym_rollout function name
  • All fixture parameters renamed appropriately
  • Internal variable names updated consistently
examples/nemo_gym/run_grpo_nemo_gym.py (4)

39-57: LGTM - Imports correctly updated to NeMo-Gym.

All imports are consistently renamed to use NeMo-Gym equivalents from the updated module paths.


78-109: LGTM - Dataset setup function renamed with consistent variable names.

Function setup_single_nemo_gym_dataset and all internal variables (nemo_gym_examples, nemo_gym_example) are consistently renamed.


132-157: LGTM - Trajectory collection updated to use NeMo-Gym rollout.

The run_async_nemo_gym_rollout call and result variable nemo_gym_rollout_result are correctly renamed.


198-267: LGTM - Main function configuration and environment setup updated.

All NeMo-Gym related configuration and environment setup is correctly updated:

  • setup_nemo_gym_config call (line 199)
  • _should_use_nemo_gym assertion (line 202)
  • Config key access: config["env"]["nemo_gym"] (lines 250, 255)
  • NemoGymConfig and NemoGym actor instantiation
  • Task environment mapping: {"nemo_gym": nemo_gym} (line 266)

tests/unit/environments/test_nemo_gym.py (8)

26-26: LGTM - Imports updated to NeMo-Gym equivalents.

Correctly imports from the renamed module nemo_rl.environments.nemo_gym.


35-35: LGTM - Tokenizer fixture correctly aliased.

The tokenizer fixture is aliased as nemo_gym_tokenizer for clarity in this test module.


38-44: LGTM - Installation check updated to NeMo-Gym.

The try/except block correctly checks for nemo_gym package installation and sets the NEMO_GYM_INSTALLED flag.
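The installation check described here follows a common optional-dependency pattern, sketched below. Only the `nemo_gym` package name and the `NEMO_GYM_INSTALLED` flag come from the review; the helper `nemo_gym_skip_reason` is a hypothetical addition for illustration.

```python
from typing import Optional

try:
    # Imported only to probe availability; the module itself is not used here.
    import nemo_gym  # noqa: F401

    NEMO_GYM_INSTALLED = True
except ImportError:
    NEMO_GYM_INSTALLED = False


def nemo_gym_skip_reason() -> Optional[str]:
    """Hypothetical helper: return a skip message when the package is missing."""
    if NEMO_GYM_INSTALLED:
        return None
    return "nemo_gym is not installed; skipping NeMo-Gym tests"
```

In a pytest module, a flag like this is typically passed to `pytest.mark.skipif(not NEMO_GYM_INSTALLED, reason=...)`, so tests are skipped instead of failing at import time.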


47-54: LGTM - Stub module test renamed.

Test function correctly renamed to test_nemo_gym_stub_module with updated print message.


57-78: LGTM - vLLM generation fixture renamed.

Fixture nemo_gym_vllm_generation correctly uses setup_nemo_gym_config and depends on nemo_gym_tokenizer.


81-132: LGTM - NeMo-Gym actor fixture correctly configured.

The fixture properly creates a NemoGym actor with:

  • NemoGymConfig configuration
  • Correct runtime environment path nemo_rl.environments.nemo_gym.NemoGym
  • Health check before yielding
  • Proper cleanup with shutdown and ray.kill

143-200: LGTM - Sanity test function renamed with consistent parameters.

Test function test_nemo_gym_sanity correctly uses renamed fixtures and variables throughout. The test logic remains functionally equivalent.


135-140: No action required—test data file exists at the specified path.

The file test_nemo_gym_sanity.json is present at tests/unit/environments/nemo_gym_test_data/ and the path construction in the fixture is correct. No old test data files remain.

nemo_rl/environments/nemo_gym.py (1)

234-250: nemo_gym_example_to_nemo_rl_datum_spec mapping looks consistent with DatumSpec.

The helper correctly wraps the raw nemo_gym_example in extra_env_info, assigns idx, and tags task_name="nemo_gym", matching the DatumSpec shape from nemo_rl.data.interfaces. The extra token_ids key is a pragmatic compatibility shim with current GRPO code.
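The mapping described above can be pictured with a small sketch. It uses a plain dict, not the real DatumSpec type, and includes only the fields named in this comment. The function name `datum_spec_sketch` and the empty-list value for `token_ids` are assumptions; any other DatumSpec fields are omitted.

```python
from typing import Any


def datum_spec_sketch(nemo_gym_example: dict[str, Any], idx: int) -> dict[str, Any]:
    """Illustrative mapping from a raw NeMo-Gym example to a datum-like dict."""
    return {
        # The raw example is carried through untouched for the environment.
        "extra_env_info": nemo_gym_example,
        # Position of the example within the dataset.
        "idx": idx,
        # Routes the datum to the NeMo-Gym environment.
        "task_name": "nemo_gym",
        # Assumption: a placeholder value for the compatibility key.
        "token_ids": [],
    }


datum = datum_spec_sketch({"prompt": "hypothetical example"}, idx=0)
```

Keeping the raw example under `extra_env_info` means the environment receives it unchanged, while `task_name` is the key used to pick the environment from a mapping such as `{"nemo_gym": nemo_gym}`.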

@github-actions

github-actions bot commented Dec 2, 2025

✅ Submodule Fast-Forward Check Results

Check based on commit: e7c9c8c (PR #1587 from tk/gym-rename)

✅ Submodules that are properly updated:

Gym: ✅ New submodule being added

All submodule changes look good! ✨

bxyu-nvidia previously approved these changes Dec 2, 2025
@github-actions

github-actions bot commented Dec 2, 2025

✅ Submodule Fast-Forward Check Results

Check based on commit: df3be18 (PR #1587 from tk/gym-rename)

✅ Submodules that are properly updated:

Gym: ✅ New submodule being added

All submodule changes look good! ✨

@terrykong terrykong added the CI:L1 Run doctests, unit tests, and functional tests label Dec 2, 2025
@terrykong terrykong added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Dec 2, 2025
@terrykong terrykong enabled auto-merge (squash) December 2, 2025 20:55
@github-actions

github-actions bot commented Dec 2, 2025

✅ Submodule Fast-Forward Check Results

Check based on commit: c3c1ef5 (PR #1587 from tk/gym-rename)

✅ Submodules that are properly updated:

Gym: ✅ New submodule being added

All submodule changes look good! ✨

bxyu-nvidia previously approved these changes Dec 3, 2025
@terrykong terrykong added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Dec 4, 2025
@github-actions

github-actions bot commented Dec 4, 2025

✅ Submodule Fast-Forward Check Results

Check based on commit: c1f272a (PR #1587 from tk/gym-rename)

✅ Submodules that are properly updated:

Gym: ✅ New submodule being added

All submodule changes look good! ✨

@terrykong terrykong added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Dec 4, 2025
@github-actions

github-actions bot commented Dec 4, 2025

✅ Submodule Fast-Forward Check Results

Check based on commit: 694b4b7 (PR #1587 from tk/gym-rename)

✅ Submodules that are properly updated:

Gym: ✅ New submodule being added

All submodule changes look good! ✨

@terrykong terrykong merged commit 23d2bed into main Dec 5, 2025
40 of 41 checks passed
@terrykong terrykong deleted the tk/gym-rename branch December 5, 2025 01:11

Labels

CI:L1 Run doctests, unit tests, and functional tests

Development

Successfully merging this pull request may close these issues.

Rename penguin + add the nemo-gym submodule

4 participants