Description
Is your feature request related to a problem? Please describe.
Related
NVIDIA-NeMo/RL#1227
NVIDIA-NeMo/RL#1235
We've noticed in nemo-rl that some parallel plans are more numerically stable than others. In particular, when o_proj and mlp.down_proj are RowwiseParallel,
you can get different eval results depending on the TP size. @joyang-nv did a lot of heavy lifting to demonstrate this.
I would like to request an option that lets users select a more numerically stable TP plan. I'm not sure it makes sense to always use the stable plan, because other frameworks like vLLM
will still use RowwiseParallel, so from the RL perspective of matching the training framework's implementation against the inference framework's, it is more on-policy to leave the default as it is.
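For anyone skimming, here is a toy sketch of why a row-wise split can make results depend on the TP size. It is not taken from the linked issues or from any framework code: `rowwise_parallel_dot` is a hypothetical helper, the sequential sum is only a stand-in for an all-reduce, and the input values are deliberately exaggerated so the effect is visible in plain Python floats. The point is just that sharding the contraction dimension changes the order of floating-point additions, and floating-point addition is not associative.

```python
def rowwise_parallel_dot(x, w, tp):
    """Dot product with the contraction dimension split across `tp` shards.

    Hypothetical helper for illustration only: each "rank" accumulates its
    own partial sum, then the partials are summed (a stand-in for all-reduce).
    """
    assert len(x) == len(w) and len(x) % tp == 0
    shard = len(x) // tp
    partials = []
    for rank in range(tp):
        lo, hi = rank * shard, (rank + 1) * shard
        acc = 0.0
        for xi, wi in zip(x[lo:hi], w[lo:hi]):
            acc += xi * wi
        partials.append(acc)
    # Reduce the per-rank partial sums in rank order.
    total = 0.0
    for p in partials:
        total += p
    return total


# Exaggerated inputs: the exact answer is 2.0, but the small terms are lost
# or kept depending on which large term they are added to first.
x = [1e16, 1.0, -1e16, 1.0]
w = [1.0, 1.0, 1.0, 1.0]

results = {tp: rowwise_parallel_dot(x, w, tp) for tp in (1, 2, 4)}
print(results)  # tp=1 -> 1.0, tp=2 -> 0.0, tp=4 -> 1.0
```

In real models the differences are far smaller per operation, but they come from the same source: the partial sums are grouped differently for each TP size.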