More numerically stable plans #535

@terrykong

Is your feature request related to a problem? Please describe.

Related
NVIDIA-NeMo/RL#1227
NVIDIA-NeMo/RL#1235

We've noticed in nemo-rl that some parallel plans are more numerically stable than others. In particular, when o_proj and mlp.down_proj use RowwiseParallel, eval results can differ depending on the tensor-parallel (TP) size. @joyang-nv did a lot of the heavy lifting to demonstrate this.
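
To illustrate one plausible mechanism (this is a toy sketch, not nemo-rl code, and not a confirmed root cause): a row-wise-parallel layer splits the reduction dimension across ranks, so each rank rounds its partial result to low precision before the all-reduce. The example below emulates a single output element in fp16 using only the standard library. The helper names `to_half` and `rowwise_dot` are hypothetical, and it assumes the local product accumulates in higher precision while the all-reduce sums in half precision, in rank order.

```python
import struct


def to_half(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", value))[0]


def rowwise_dot(x, w, tp):
    """Emulate one output element of a row-wise-parallel linear layer.

    Assumptions (for illustration only): each rank accumulates its shard in
    higher precision and rounds the partial result to fp16; the all-reduce is
    modelled as a sequential fp16 sum over ranks.
    """
    shard = len(x) // tp
    total = 0.0
    for rank in range(tp):
        lo, hi = rank * shard, (rank + 1) * shard
        # Local partial product over this rank's slice of the reduction dim.
        partial = to_half(sum(a * b for a, b in zip(x[lo:hi], w[lo:hi])))
        # Emulated all-reduce step: add in half precision.
        total = to_half(total + partial)
    return total


# Inputs chosen so every element and every product is exact in fp16.
# The exact dot product is 2048 + 0.75 + 0.25 + 0.25 = 2049.25.
x = [64.0, 1.5, 0.5, 0.5]
w = [32.0, 0.5, 0.5, 0.5]

for tp in (1, 2, 4):
    print(f"TP={tp}: {rowwise_dot(x, w, tp)}")
# TP=1: 2050.0
# TP=2: 2048.0
# TP=4: 2048.0
```

With TP=1 the full sum is rounded once. With TP=2 or TP=4 the small terms are lost when each partial is rounded, so the same inputs give a different output.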

I'd like to request an option for selecting a more numerically stable TP plan. I'm not sure it makes sense to make this the default, because other frameworks such as vLLM will still use RowwiseParallel. From the RL perspective of matching the training framework's implementation to the inference framework's, keeping the current plan is more on-policy.
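
As a rough sketch of what such an opt-in could look like (everything here is hypothetical: the function name, the layer-name keys, and the string style labels are placeholders, not an existing API), the default plan keeps RowwiseParallel for o_proj and mlp.down_proj, and only an explicit request swaps those two layers. The replacement style is left to the caller, since this issue does not specify it; the colwise entries for the other projections are also an assumption.

```python
from typing import Dict, Optional

# Hypothetical default plan: layer-name suffix -> parallel style label.
# Only the "rowwise" entries for o_proj / down_proj come from the issue text.
DEFAULT_TP_PLAN: Dict[str, str] = {
    "self_attn.q_proj": "colwise",
    "self_attn.k_proj": "colwise",
    "self_attn.v_proj": "colwise",
    "self_attn.o_proj": "rowwise",
    "mlp.gate_proj": "colwise",
    "mlp.up_proj": "colwise",
    "mlp.down_proj": "rowwise",
}

# Layers the issue identifies as TP-sensitive when row-wise parallel.
TP_SENSITIVE_LAYERS = ("self_attn.o_proj", "mlp.down_proj")


def select_tp_plan(stable_style: Optional[str] = None) -> Dict[str, str]:
    """Return a copy of the plan, optionally overriding the sensitive layers.

    stable_style=None keeps the default (row-wise) plan, which matches what
    inference frameworks such as vLLM use. Any other value is applied to the
    TP-sensitive layers only.
    """
    plan = dict(DEFAULT_TP_PLAN)
    if stable_style is not None:
        for name in TP_SENSITIVE_LAYERS:
            plan[name] = stable_style
    return plan
```

Keeping the row-wise plan as the default preserves the on-policy match with the inference framework, while users who care more about TP-independent evals can opt in.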

Labels: enhancement