quantkitty

inspired by mlabonne's autoquant (check out the thread to find out more about each type of quant). cells do not contaminate each other, so you can run multiple quants in one go without restarting the kernel. it also allows exl2 requanting once the measurement is done, shows progress during the model download, auto-uploads your quant, and much more. everything lives in one portable jupyter notebook that can be dropped into a runpod for easy use, no colab required.

Note

make sure to fill in your huggingface USERNAME and HF_TOKEN (you can create a token in your huggingface settings), otherwise uploading your quant won't work.
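a minimal sketch of a sanity check you could run before starting a long quant, so a missing value fails early instead of at upload time. `check_credentials` is a hypothetical helper and not part of the notebook; only the USERNAME and HF_TOKEN names come from this readme.

```python
def check_credentials(username: str, hf_token: str) -> None:
    """Raise ValueError if either value is empty or only whitespace.

    Hypothetical helper for illustration; the notebook may handle this differently.
    """
    if not username or not username.strip():
        raise ValueError("USERNAME is empty, fill in your huggingface username")
    if not hf_token or not hf_token.strip():
        raise ValueError("HF_TOKEN is empty, create a token in your huggingface settings")
```

call it as `check_credentials(USERNAME, HF_TOKEN)` in the first cell. this only catches empty values; it does not verify that the token is valid or has write access.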

Note

all quants are uploaded as private, so you can double-check them before publishing to your profile
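for reference, a private upload can be sketched with the `huggingface_hub` client roughly like this. this is an assumption about how such an upload could be done, not the notebook's actual code; `upload_private`, `quant_name` and `folder` are hypothetical names, while `create_repo` and `upload_folder` are existing `HfApi` methods.

```python
def upload_private(api, username: str, quant_name: str, folder: str) -> str:
    """Create a private model repo (if missing) and upload a local folder to it.

    `api` is expected to be a huggingface_hub.HfApi instance created with your token.
    Hypothetical helper for illustration only.
    """
    repo_id = f"{username}/{quant_name}"
    # private=True keeps the repo hidden until you publish it yourself
    api.create_repo(repo_id=repo_id, private=True, exist_ok=True)
    # uploads every file in `folder` to the root of the repo
    api.upload_folder(folder_path=folder, repo_id=repo_id)
    return repo_id
```

usage would look like `upload_private(HfApi(token=HF_TOKEN), USERNAME, "my-model-exl2", "./output")` after `from huggingface_hub import HfApi`, where the repo name and folder are placeholders. once you have checked the files, you can switch the repo to public from its settings page on the hub.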

supports:

  • exl2 (with fast measurement requant)
  • awq
  • hqq
  • gptq
  • gguf