Are there plans to make exllama pip installable? #152
jason-brian-anderson
started this conversation in
Ideas
Replies: 2 comments
-
Looking around in oobabooga, I see https://github.com/jllllll/exllama, which is a pip-installable Python-module version of exllama, so this is exactly what I need. I'll leave this thread here in case anyone else has the same question.
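For anyone else landing here, that fork can reportedly be installed straight from its GitHub repo with pip. A minimal sketch, assuming the repo's setup script builds the CUDA extension on your machine (a working CUDA toolchain and a matching PyTorch install are assumed; no prebuilt wheel URL is verified here):

```shell
# Install the pip-installable exllama fork directly from GitHub.
# Assumes CUDA toolchain + compatible PyTorch are already present.
pip install git+https://github.com/jllllll/exllama
```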
-
I second this. I don't like to come to somebody's repo and tell them they're doing things wrong, so my apologies. But wouldn't it make a lot more sense for exllama to be a library instead of a standalone application? You could always take the chat code and make it its own repo that uses exllama.
-
Unless I'm just clueless, exllama is the most efficient model loader out there, both in terms of performance and VRAM. Wouldn't the natural thing be to make it pip-installable so that it can power apps?
I want to build a framework on top of a fast loader and need the absolute best performance on a 4090 24GB in terms of it/s. As far as I can tell, my only real option for that is to fork the exllama repo. A fork doesn't seem to make sense if the framework is much bigger, unrelated, and just uses exllama as a loader.
Alternatively, maybe the Docker config could be made more flexible and less tied to the web UI startup: something that could be FROM'ed in a Dockerfile?
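To illustrate the idea (purely hypothetical: `exllama/exllama:latest` is an assumed base-image name, no such published image is confirmed, and `my_app.py` is a placeholder), a downstream app could then build on top of it like this:

```dockerfile
# HYPOTHETICAL base image name -- the project publishes no such image today.
# The base would carry exllama plus its CUDA/PyTorch dependencies;
# the downstream image layers its own app on top instead of the web UI.
FROM exllama/exllama:latest
WORKDIR /app
COPY . /app
RUN pip install -r requirements.txt
# Run the downstream framework rather than the bundled web UI entrypoint.
CMD ["python", "my_app.py"]
```

The point being that the image would provide exllama as a dependency, while the entrypoint stays under the downstream project's control.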
Apologies if I'm missing the obvious. This project opens so many doors...