perf(glm52): MLA decode arena + CUDA graph capture on top of #535 (−36%/layer) + kernel bench #533

Open
n-WN wants to merge 8 commits into openinfer-project:main from n-WN:feat/glm52-kernel-bench

fix(glm52): the decode scratch owns its FlashMLA contract (review round)

c5173c2
CPU checks: succeeded Jul 3, 2026 in 6m 6s