More numerically stable plans

**Is your feature request related to a problem? Please describe.**

Related
https://github.com/NVIDIA-NeMo/RL/issues/1227
https://github.com/NVIDIA-NeMo/RL/pull/1235

We've noticed in nemo-rl that some parallel plans are more numerically stable than others. In particular, when `o_proj` and `mlp.down_proj` are `RowwiseParallel`, eval results can differ depending on the TP size. @joyang-nv did a lot of the heavy lifting to demonstrate this.
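
To illustrate one plausible mechanism, here is a toy sketch (not nemo-rl code). It assumes the TP dependence comes from row-wise sharding splitting the reduction dimension, so the partial sums are combined in a different floating-point order for each TP size. The helpers `dot_f32`, `rowwise_matvec`, and `colwise_matvec` are hypothetical names, and the all-reduce is emulated with a plain loop over shards.

```python
import numpy as np

def dot_f32(x, w):
    """Dot product accumulated left to right in float32."""
    acc = np.float32(0.0)
    for xi, wi in zip(x, w):
        acc = np.float32(acc + np.float32(xi) * np.float32(wi))
    return acc

def rowwise_matvec(x, W, tp):
    """Shard the reduction (input) dim over `tp` ranks, then sum the partials.

    The loop over ranks stands in for the all-reduce of partial outputs.
    """
    k, n = W.shape
    shard = k // tp
    out = []
    for j in range(n):
        total = np.float32(0.0)
        for r in range(tp):
            s = slice(r * shard, (r + 1) * shard)
            total = np.float32(total + dot_f32(x[s], W[s, j]))
        out.append(total)
    return np.array(out, dtype=np.float32)

def colwise_matvec(x, W, tp):
    """Shard the output dim over `tp` ranks; each rank does the full reduction."""
    k, n = W.shape
    shard = n // tp
    out = []
    for r in range(tp):
        for j in range(r * shard, (r + 1) * shard):
            out.append(dot_f32(x, W[:, j]))
    return np.array(out, dtype=np.float32)

# Values chosen so float32 rounding depends on how the sum is grouped.
x = np.array([1e8, 1.0, -1e8, 1.0], dtype=np.float32)
W = np.ones((4, 4), dtype=np.float32)

row = {tp: rowwise_matvec(x, W, tp) for tp in (1, 2, 4)}
col = {tp: colwise_matvec(x, W, tp) for tp in (1, 2, 4)}

for tp in (1, 2, 4):
    print(f"tp={tp} rowwise={row[tp]} colwise={col[tp]}")
```

In this toy setup the row-wise result changes between TP=1 and TP=2, while the column-wise result is identical for every TP size because each output element is always reduced in the same order. Real kernels accumulate differently, so treat this only as an intuition for why sharding the reduction dimension can make outputs TP-dependent.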

I would like to request an option that lets users select a more numerically stable TP plan. I'm not sure it makes sense to always use the stable plan, because other frameworks such as `vllm` will still use `RowwiseParallel`. From the RL perspective of matching the training framework's implementation to the inference framework's, keeping the current plan is more on-policy.


More numerically stable plans #535