This repository contains the code and models used in the paper "Cost-Optimal Grouped-Query Attention for Long-Context LLMs".
The code repository for the paper "Cost-Optimal Grouped-Query Attention for Long-Context LLMs"
thunlp/cost-optimal-gqa