
[Usage]: Does DeepSeek-R1 1.58-bit Dynamic Quant work on VLLM? #12573

Open
shimmyshimmer opened this issue Jan 30, 2025 · 4 comments
Labels
usage How to use vllm

Comments

@shimmyshimmer

Your current environment

Hey guys! Recently in our blog post we wrote that vLLM supports GGUFs; however, we've been getting many reports that the R1 GGUFs don't actually work in vLLM at the moment and that people are hitting errors.

I'm guessing it's not supported at the moment? Thank you! :)

Blog: https://unsloth.ai/blog/deepseekr1-dynamic
Model: https://huggingface.co/unsloth/DeepSeek-R1-GGUF

How would you like to use vllm

Run the DeepSeek-R1 1.58-bit dynamic quant.
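For context, here is a minimal sketch (not part of the original report) of fetching the 1.58-bit dynamic-quant shards from the Hugging Face repo linked above; the `*UD-IQ1_S*` pattern is an assumption about the repo's file naming and should be checked against the repo listing:

```python
# Hedged sketch: download only the 1.58-bit dynamic quant shards from the
# unsloth/DeepSeek-R1-GGUF repo. The "*UD-IQ1_S*" pattern is an assumption
# about the shard naming; verify it against the repo file listing.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="unsloth/DeepSeek-R1-GGUF",
    local_dir="DeepSeek-R1-GGUF",
    allow_patterns=["*UD-IQ1_S*"],  # assumed pattern for the 1.58-bit quant files
)
```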

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
shimmyshimmer added the usage (How to use vllm) label on Jan 30, 2025
@robertgshaw2-redhat
Collaborator

robertgshaw2-redhat commented Jan 30, 2025

I'm not sure of the state. Can you try it?

@Isotr0py
Collaborator

Isotr0py commented Jan 31, 2025

I'm afraid not. GGUF support in vLLM depends on the GGUF interoperability in transformers (we rely on it to extract the hf_config from the GGUF file), and DeepSeek's GGUF interoperability isn't supported in transformers yet: https://huggingface.co/docs/transformers/v4.48.2/en/gguf#supported-model-architectures
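To illustrate the dependency described above, here is a minimal sketch (not from the original comment) of extracting an hf_config from a GGUF file through transformers; the shard file name is hypothetical, and at the time of this thread the call fails for DeepSeek GGUFs because transformers has no mapping for the deepseek2 architecture:

```python
# Hedged sketch of the transformers GGUF interoperability that vLLM relies on.
# The gguf_file name below is hypothetical; for DeepSeek GGUFs this currently
# fails because transformers does not map the deepseek2 architecture yet.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "unsloth/DeepSeek-R1-GGUF",                            # repo hosting the GGUF
    gguf_file="DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf",  # hypothetical shard name
)
print(config)
```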

@shimmyshimmer
Author

> I'm not sure of the state. Can you try it?

Oh yes, we tested it for a few hours; unfortunately it doesn't work.

> I'm afraid not. GGUF support in vLLM depends on the GGUF interoperability in transformers (we rely on it to extract the hf_config from the GGUF file), and DeepSeek's GGUF interoperability isn't supported in transformers yet: https://huggingface.co/docs/transformers/v4.48.2/en/gguf#supported-model-architectures

Alright, thanks so much for letting me know. Once it's supported we'll let others know as well. :)

@dannydabbles

dannydabbles commented Jan 31, 2025

Steps to reproduce the vLLM issue I'm seeing with the standard vllm/vllm-openai:latest Docker image produce the following error:

ValueError: GGUF model with architecture deepseek2 is not supported yet.
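For reference, a minimal sketch (assumed, not the exact reproduction from this comment) of how the same error surfaces through vLLM's offline Python API; the local GGUF path is hypothetical:

```python
# Hedged sketch: attempting to load a DeepSeek-R1 GGUF with vLLM's offline API.
# The file path is hypothetical; the load fails while vLLM/transformers try to
# build an hf_config from the GGUF metadata for the deepseek2 architecture.
from vllm import LLM

llm = LLM(model="/models/DeepSeek-R1-UD-IQ1_S.gguf")
# ValueError: GGUF model with architecture deepseek2 is not supported yet.
```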
