Skip to content

Conversation

terrykong
Copy link
Contributor

@terrykong terrykong commented Sep 30, 2025

A fix missed in my earlier commit: #1231

tested locally and passes

Summary by CodeRabbit

  • Tests
    • Adjusted validation to check token probability error thresholds at step 40 (previously 100) for LLM rollouts. Maintains existing gating on maximum step completion while improving CI reliability and reducing test flakiness. No impact on model behavior or user features.
  • Chores
    • Operational test tuning only; no user-facing functionality, performance, or UI changes.

@terrykong terrykong requested a review from a team as a code owner September 30, 2025 01:16
@terrykong terrykong added the CI:docs Run doctest label Sep 30, 2025
@terrykong terrykong enabled auto-merge (squash) September 30, 2025 01:16
Copy link
Contributor

coderabbitai bot commented Sep 30, 2025

📝 Walkthrough

Walkthrough

Adjusts a test assertion in the GRPO Llama3.1 8B Megatron FP8 rollouts script to compare the token_mult_prob_error threshold at index 40 instead of 100, with no other logic or command sequence changes.

Changes

Cohort / File(s) Change Summary
Test validation index adjustment
tests/test_suites/llm/grpo-llama3.1-8b-instruct-1n8g-megatron-fp8-rollouts.v2.sh
Updated the checked index for train token_mult_prob_error threshold from 100 to 40; conditional checks and flow unchanged.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

CI:L0

Suggested reviewers

  • guyueh1
  • chtruong814

Pre-merge checks and finishing touches

✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title concisely describes the primary change of updating the nightly FP8 rollout verification check from step 100 to step 40, matching the modifications in the test script without extraneous detail or noise.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
Test Results For Major Changes ✅ Passed The change only adjusts the test suite’s assertion index from step 100 to step 40 to align with the configured MAX_STEPS value and does not alter any production code, numerical logic, or performance-critical paths; this is a minor test script correction that has been verified locally and therefore meets the criteria for a pass without requiring additional documented test results.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch tk/fp8-another-fix

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c2b36f2 and f36f212.

📒 Files selected for processing (1)
  • tests/test_suites/llm/grpo-llama3.1-8b-instruct-1n8g-megatron-fp8-rollouts.v2.sh (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.sh: Follow the Google Shell Style Guide for all shell scripts
Use uv run to execute Python scripts in shell/driver scripts instead of activating virtualenvs and calling python directly
Add the NVIDIA copyright header (with current year) at the top of all shell scripts, excluding tests/ and test-only scripts

Files:

  • tests/test_suites/llm/grpo-llama3.1-8b-instruct-1n8g-megatron-fp8-rollouts.v2.sh
tests/test_suites/llm/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

LLM driver script filenames must mirror the YAML base name and follow the same pattern with .sh extension

Files:

  • tests/test_suites/llm/grpo-llama3.1-8b-instruct-1n8g-megatron-fp8-rollouts.v2.sh
tests/test_suites/**

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Place driver shell scripts and common.env under tests/test_suites// and list nightly tests in tests/test_suites/nightly.txt

Files:

  • tests/test_suites/llm/grpo-llama3.1-8b-instruct-1n8g-megatron-fp8-rollouts.v2.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Docs_Tests
  • GitHub Check: Lint check
  • GitHub Check: Post automodel integration comment / Comment on PR
  • GitHub Check: Post submodule check comment / Comment on PR
🔇 Additional comments (1)
tests/test_suites/llm/grpo-llama3.1-8b-instruct-1n8g-megatron-fp8-rollouts.v2.sh (1)

38-38: LGTM! Fix aligns assertion with MAX_STEPS.

The change correctly updates the index from 100 to 40, matching the MAX_STEPS=40 configuration. The previous value would have referenced a non-existent step since training only runs to step 40.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🧪 Early access (Sonnet 4.5): enabled

We are currently testing the Sonnet 4.5 model, which is expected to improve code review quality. However, this model may lead to increased noise levels in the review comments. Please disable the early access features if the noise level causes any inconvenience.

Note:

  • Public repositories are always opted into early access features.
  • You can enable or disable early access features from the CodeRabbit UI or by updating the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI:docs Run doctest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant