[llama] Store KV Cache on CPU and Use PyTorch SDPA for Next token generation #1182

Open
zhentaoyu wants to merge 8 commits into huggingface:main from zhentaoyu:cpu_sdpa

Commits

Commits on Dec 6, 2024