[unrevert] Add batch invariant kernel override for FlashInfer backend [2/n]#26373
[unrevert] Add batch invariant kernel override for FlashInfer backend [2/n]#26373yewentao256 merged 3 commits intovllm-project:mainfrom
Conversation
Signed-off-by: Bram Wasti <bwasti@meta.com>
297ecc2 to
fcfa198
Compare
There was a problem hiding this comment.
Code Review
This pull request reinstates support for batch invariant kernel overrides for the FlashInfer backend, which is a crucial feature for ensuring deterministic outputs regardless of batch composition. The changes primarily involve updating the FlashInfer dependency and plumbing through new parameters to enable a deterministic mode in the backend. Additionally, the batch invariance tests have been significantly improved to be more robust, configurable, and to cover the FlashInfer backend. My review focuses on the correctness and robustness of these changes. The implementation for enabling batch invariance in FlashInfer appears solid. I've identified a couple of minor issues in the updated test file where a hardcoded value should be replaced with a configurable variable to make the test more robust. Overall, the changes are well-implemented and the test enhancements are excellent.
Signed-off-by: Bram Wasti <bwasti@meta.com>
fcfa198 to
5045121
Compare
yewentao256
left a comment
There was a problem hiding this comment.
Please fix the pre-commit issue so that we can get this landed
|
This pull request has merge conflicts that must be resolved before it can be |
Signed-off-by: Bram Wasti <bwasti@fb.com>
yewentao256
left a comment
There was a problem hiding this comment.
LGTM, thanks for the work!
… [2/n] (vllm-project#26373) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com> Signed-off-by: 1994 <1994@users.noreply.github.com>
… [2/n] (vllm-project#26373) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com> Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
… [2/n] (vllm-project#26373) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com> Signed-off-by: bbartels <benjamin@bartels.dev>
… [2/n] (vllm-project#26373) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com>
… [2/n] (vllm-project#26373) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com>
… [2/n] (vllm-project#26373) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
… [2/n] (vllm-project#26373) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
… [2/n] (vllm-project#26373) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com>
… [2/n] (vllm-project#26373) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com>
This change reinstates already approved + landed #25769 based on the latest bump to flashinfer:
#26326
It should not land before #26326