Releases: turboderp-org/exllamav2
0.0.20
- Adds Phi-3 support (loading sketch below)
- Wheels compiled for PyTorch 2.3.0
- ROCm 6.0 wheels
Full Changelog: v0.0.19...v0.0.20
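For reference, a Phi-3 EXL2 quant loads through the same API as any other supported architecture. A minimal sketch, assuming a local EXL2 quantization of the model; the directory path is hypothetical:

```python
# Minimal sketch: loading a Phi-3 EXL2 quant with the standard exllamav2 API.
# The model directory is a placeholder; any local Phi-3 EXL2 quant works.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/phi-3-mini-exl2"   # hypothetical path
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)       # allocate as weights load
model.load_autosplit(cache)                    # split across available GPUs

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8

print(generator.generate_simple("Hello, my name is", settings, 64))
```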
0.0.19
- More accurate Q4 cache using groupwise rotations
- Better prompt ingestion speed when using flash-attn
- Minor fixes related to issues quantizing Llama 3
- New, more robust optimizer
- Fix for a bug in long-sequence inference with GPTQ models
Full Changelog: v0.0.18...v0.0.19
0.0.18
- Support for Command-R-plus
- Fix for pre-AVX2 CPUs
- VRAM optimizations for quantization
- Very preliminary multimodal support
- Various other small fixes and optimizations
Full Changelog: v0.0.17...v0.0.18
0.0.17
Mostly just minor fixes and support for DBRX models.
Full Changelog: v0.0.16...v0.0.17
0.0.16
- Adds support for Cohere models
- N-gram decoding (sketch below)
- A few bugfixes
- Lots of optimizations
Full Changelog: v0.0.15...v0.0.16
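N-gram decoding drafts tokens from n-grams already seen in the context, so it speeds up repetitive output without needing a separate draft model. A minimal sketch, assuming `model`, `cache` and `tokenizer` are loaded as in the Phi-3 example above; the `speculative_ngram` flag name is an assumption to verify against the streaming generator:

```python
# Sketch: n-gram speculative decoding on the streaming generator.
# Assumes model, cache and tokenizer are already loaded (see the Phi-3
# sketch above). speculative_ngram is assumed to be the relevant switch.
from exllamav2.generator import ExLlamaV2StreamingGenerator, ExLlamaV2Sampler

generator = ExLlamaV2StreamingGenerator(model, cache, tokenizer)
generator.speculative_ngram = True      # draft tokens from n-grams in context

settings = ExLlamaV2Sampler.Settings()
input_ids = tokenizer.encode("def fibonacci(n):")
generator.begin_stream(input_ids, settings)

for _ in range(200):                    # cap generation at 200 tokens
    chunk, eos, _ = generator.stream()
    print(chunk, end="")
    if eos:
        break
```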
0.0.15
- Adds Q4 cache mode (usage sketch below)
- Support for StarCoder2
- Minor optimizations and a couple of bugfixes
Full Changelog: v0.0.14...v0.0.15
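The Q4 cache is a drop-in replacement for the default FP16 cache, storing keys and values at roughly 4 bits per element and cutting cache VRAM to about a quarter at some accuracy cost. A minimal usage sketch; the model path is hypothetical:

```python
# Sketch: using the Q4 cache in place of the default FP16 cache.
# ExLlamaV2Cache_Q4 is a drop-in replacement for ExLlamaV2Cache.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_Q4

config = ExLlamaV2Config()
config.model_dir = "/models/llama2-70b-exl2"   # hypothetical path
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_Q4(model, lazy=True)    # keys/values stored at ~4 bits
model.load_autosplit(cache)
```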
0.0.14
Adds support for Qwen1.5 and Gemma architectures.
Various fixes and optimizations.
Full Changelog: v0.0.13...v0.0.14
0.0.13.post2
Full Changelog: 0.0.13.post1...0.0.13.post2
0.0.13.post1
Fixes inference on models with vocab sizes that are not multiples of 32
0.0.13
This release mainly updates the prebuilt wheels to PyTorch 2.2, since Torch 2.2 won't load extensions built against earlier versions.
Also adds dynamic temperature and quadratic sampling (sketch below), fixes a performance regression that the recent batching optimizations caused on some GPUs, and includes various other small fixes.
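Both new samplers are plain fields on the sampler settings object. A minimal sketch; the field names here (min_temp, max_temp and temp_exponent for dynamic temperature, smoothing_factor for quadratic sampling) are assumptions to verify against the sampler source:

```python
# Sketch: the new sampling controls on ExLlamaV2Sampler.Settings.
# Field names are assumptions; check exllamav2/generator/sampler.py
# for the authoritative names and defaults.
from exllamav2.generator import ExLlamaV2Sampler

settings = ExLlamaV2Sampler.Settings()

# Dynamic temperature: scale temperature between a min and max based on
# the entropy of the token distribution.
settings.min_temp = 0.3
settings.max_temp = 1.5
settings.temp_exponent = 1.0

# Quadratic sampling: a smoothing factor > 0 reshapes the logits to favor
# high-probability tokens without a hard top-k/top-p cutoff.
settings.smoothing_factor = 0.3
```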