Commit f9ba1c4: fix

qinxuye committed Jan 31, 2025
1 parent b33e353
Showing 3 changed files with 6 additions and 4 deletions.
doc/source/models/builtin/llm/qwen2-vl-instruct.rst (1 addition, 1 deletion)

@@ -86,7 +86,7 @@ Model Spec 5 (mlx, 2 Billion)
 - **Quantizations:** 4bit, 8bit
 - **Engines**: MLX
 - **Model ID:** mlx-community/Qwen2-VL-2B-Instruct-{quantization}
-- **Model Hubs**: `Hugging Face <https://huggingface.co/mlx-community/Qwen2-VL-2B-Instruct-{quantization}>`__, `ModelScope <https://modelscope.cn/models/okwinds/Qwen2-VL-2B-Instruct-MLX-8bit>`__
+- **Model Hubs**: `Hugging Face <https://huggingface.co/mlx-community/Qwen2-VL-2B-Instruct-{quantization}>`__, `ModelScope <https://modelscope.cn/models/mlx-community/Qwen2-VL-2B-Instruct-{quantization}>`__
 
 Execute the following command to launch the model, remember to replace ``${quantization}`` with your
 chosen quantization method from the options listed above::
doc/source/models/builtin/llm/qwen2.5-vl-instruct.rst (3 additions, 3 deletions)

@@ -68,7 +68,7 @@ Model Spec 4 (mlx, 3 Billion)
 - **Model Format:** mlx
 - **Model Size (in billions):** 3
 - **Quantizations:** 3bit, 4bit, 6bit, 8bit, bf16
-- **Engines**: Transformers, MLX
+- **Engines**: MLX
 - **Model ID:** mlx-community/Qwen2.5-VL-3B-Instruct-{quantization}
 - **Model Hubs**: `Hugging Face <https://huggingface.co/mlx-community/Qwen2.5-VL-3B-Instruct-{quantization}>`__, `ModelScope <https://modelscope.cn/models/mlx-community/Qwen2.5-VL-3B-Instruct-{quantization}>`__

@@ -84,7 +84,7 @@ Model Spec 5 (mlx, 7 Billion)
 - **Model Format:** mlx
 - **Model Size (in billions):** 7
 - **Quantizations:** 3bit, 4bit, 6bit, 8bit, bf16
-- **Engines**: Transformers, MLX
+- **Engines**: MLX
 - **Model ID:** mlx-community/Qwen2.5-VL-7B-Instruct-{quantization}
 - **Model Hubs**: `Hugging Face <https://huggingface.co/mlx-community/Qwen2.5-VL-7B-Instruct-{quantization}>`__, `ModelScope <https://modelscope.cn/models/mlx-community/Qwen2.5-VL-7B-Instruct-{quantization}>`__

@@ -100,7 +100,7 @@ Model Spec 6 (mlx, 72 Billion)
 - **Model Format:** mlx
 - **Model Size (in billions):** 72
 - **Quantizations:** 3bit, 4bit, 6bit, 8bit, bf16
-- **Engines**: Transformers, MLX
+- **Engines**: MLX
 - **Model ID:** mlx-community/Qwen2.5-VL-72B-Instruct-{quantization}
 - **Model Hubs**: `Hugging Face <https://huggingface.co/mlx-community/Qwen2.5-VL-72B-Instruct-{quantization}>`__, `ModelScope <https://modelscope.cn/models/mlx-community/Qwen2.5-VL-72B-Instruct-{quantization}>`__
xinference/model/llm/transformers/qwen2_vl.py (2 additions)

@@ -45,6 +45,8 @@ def __init__(self, *args, **kwargs):
     def match(
         cls, model_family: "LLMFamilyV1", model_spec: "LLMSpecV1", quantization: str
     ) -> bool:
+        if model_spec.model_format not in ["pytorch", "gptq", "awq"]:
+            return False
         llm_family = model_family.model_family or model_family.model_name
         if "qwen2-vl-instruct".lower() in llm_family.lower():
             return True
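The guard added above makes the Transformers-based Qwen2-VL class claim only pytorch/gptq/awq checkpoints, so mlx-format specs no longer match it and can be handled by the MLX engine instead (consistent with the docs change dropping "Transformers" from the mlx specs' engine list). A minimal standalone sketch of that dispatch pattern, where `Spec`, `TransformersQwen2VL`, `MLXQwen2VL`, and `pick_engine` are simplified stand-ins and not Xinference's actual API:

```python
# Hypothetical sketch of engine dispatch; only the format-gating
# logic mirrors the commit, the class/function names are invented.

class Spec:
    def __init__(self, model_format: str, family: str):
        self.model_format = model_format
        self.family = family


class TransformersQwen2VL:
    @classmethod
    def match(cls, spec: Spec) -> bool:
        # The new guard: refuse formats the Transformers backend cannot
        # load, so mlx checkpoints fall through to the MLX engine.
        if spec.model_format not in ["pytorch", "gptq", "awq"]:
            return False
        return "qwen2-vl-instruct" in spec.family.lower()


class MLXQwen2VL:
    @classmethod
    def match(cls, spec: Spec) -> bool:
        return spec.model_format == "mlx" and "qwen2-vl-instruct" in spec.family.lower()


def pick_engine(spec: Spec) -> str:
    # First matching class wins, as in a registry scan.
    for name, engine_cls in [("Transformers", TransformersQwen2VL), ("MLX", MLXQwen2VL)]:
        if engine_cls.match(spec):
            return name
    raise ValueError("no engine matches this spec")


print(pick_engine(Spec("pytorch", "qwen2-vl-instruct")))  # Transformers
print(pick_engine(Spec("mlx", "qwen2-vl-instruct")))      # MLX
```

Without the guard, the Transformers class would match mlx specs first and shadow the MLX engine; the early `return False` keeps the registry scan moving.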
