
Launching a local qwen2.5 model with Transformers fails: RuntimeError: Failed to launch model, detail: [address=0.0.0.0:50410, pid=15235] Model not found, name: qwen2_5-chat, format: None, size: None, quantization: None #2773

Closed
1 of 3 tasks
ganchun1130 opened this issue Jan 22, 2025 · 10 comments


ganchun1130 commented Jan 22, 2025

System Info

transformers 4.44.2
torch 2.4.1+cu124
xinference 1.2.0

CUDA is 12.4; not running in Docker.

Running Xinference with Docker?

  • docker
  • pip install
  • installation from source

Version info

The error output is as follows:

```
Launch model name: qwen2_5-chat with kwargs: {'model_path': '/mnt/general/ganchun/model/Qwen2.5-0.5B-Instruct'}
Traceback (most recent call last):
  File "/mnt/general/ganchun/miniconda3/envs/paper/bin/xinference", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/xinference/deploy/cmdline.py", line 908, in model_launch
    model_uid = client.launch_model(
                ^^^^^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/xinference/client/restful/restful_client.py", line 999, in launch_model
    raise RuntimeError(
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:50410, pid=15235] Model not found, name: qwen2_5-chat, format: None, size: None, quantization: None
```

The command used to start Xinference

The commands I used:

```shell
xinference-local --host 0.0.0.0 --port 9997
xinference launch --model_path /mnt/general/ganchun/model/Qwen2.5-0.5B-Instruct --model-engine Transformers -n qwen2_5-chat
```

Reproduction

```shell
xinference-local --host 0.0.0.0 --port 9997
xinference launch --model_path /mnt/general/ganchun/model/Qwen2.5-0.5B-Instruct --model-engine Transformers -n qwen2_5-chat
```

Then the error appears:

```
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:50410, pid=15235] Model not found, name: qwen2_5-chat, format: None, size: None, quantization: None
```

Expected behavior

No error; the model loads and works normally.

@XprobeBot XprobeBot added the gpu label Jan 22, 2025
@XprobeBot XprobeBot added this to the v1.x milestone Jan 22, 2025

qinxuye commented Jan 22, 2025

https://inference.readthedocs.io/en/latest/models/builtin/llm/qwen2.5-instruct.html#model-spec-1-pytorch-0-5-billion

The model name is wrong; it should be qwen2.5-instruct. See the documentation for how to load it.
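Based on the documentation linked above, the corrected launch would look roughly like this; a sketch only, assuming the `--size-in-billions` value and flag spellings match the built-in qwen2.5-instruct spec (verify against `xinference launch --help`):

```shell
# Use the built-in model name qwen2.5-instruct instead of the
# nonexistent qwen2_5-chat. The 0_5 size value and --model-path flag
# are assumptions based on the linked model-spec page.
xinference launch \
  --model-name qwen2.5-instruct \
  --model-engine Transformers \
  --size-in-billions 0_5 \
  --model-path /mnt/general/ganchun/model/Qwen2.5-0.5B-Instruct
```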

ganchun1130 (Author) replied:

> https://inference.readthedocs.io/en/latest/models/builtin/llm/qwen2.5-instruct.html#model-spec-1-pytorch-0-5-billion
>
> The model name is wrong; it should be qwen2.5-instruct. See the documentation for how to load it.

That indeed works, but now there is a new error: RuntimeError: Failed to launch model, detail: [address=0.0.0.0:45927, pid=17373] Error while deserializing header: HeaderTooLarge

Can this be resolved?


qinxuye commented Jan 22, 2025

Please paste the full error.

ganchun1130 (Author) replied:

> Please paste the full error.

It's solved now, thank you.

qinxuye closed this as not planned Jan 22, 2025
ganchun1130 (Author) replied:

One more small question: if I deploy two qwen2.5 models, how should I set --model-name?

qinxuye commented Jan 22, 2025

> If I deploy two qwen2.5 models, how should I set --model-name?

The same model? You can set the replica count to 2. --model-name refers to the model's name, so only one is needed; --model-uid is the name of the instance.
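On the command line, the maintainer's suggestion might look like the sketch below; the `--replica` and `--model-uid` spellings, and the instance name, are assumptions to check against `xinference launch --help`:

```shell
# Two replicas of the same model served behind one endpoint.
# "my-qwen25-instance" is a hypothetical instance name; the
# --replica / --model-uid flag spellings should be verified.
xinference launch \
  --model-name qwen2.5-instruct \
  --model-engine Transformers \
  --size-in-billions 0_5 \
  --replica 2 \
  --model-uid my-qwen25-instance
```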

ganchun1130 (Author) replied:

> The same model? You can set the replica count to 2. --model-name refers to the model's name, so only one is needed; --model-uid is the name of the instance.

Got it. What I actually want is to run different models, say one 7B and one 14B. Do I set it up the same way?

qinxuye commented Jan 22, 2025

That doesn't work for different models; by default the uid is just the model name.

ganchun1130 (Author) replied:

> That doesn't work for different models; by default the uid is just the model name.

If the models are from the same family but have different parameter sizes, is the only option to start another xinference service?

qinxuye commented Jan 22, 2025

> If the models are from the same family but have different parameter sizes, is the only option to start another xinference service?

No need; just launch them directly, provided you have enough GPU capacity.
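A sketch of what launching two sizes of the same family against one running server could look like; the size values follow the built-in qwen2.5-instruct specs, while the distinct uids are hypothetical names (confirm the flags with `xinference launch --help`):

```shell
# One xinference server can host both launches. Since the default uid
# is the model name (per the maintainer's note), give each launch a
# distinct --model-uid so the two instances don't collide.
# "qwen25-7b" / "qwen25-14b" are made-up instance names.
xinference launch --model-name qwen2.5-instruct --model-engine Transformers \
  --size-in-billions 7 --model-uid qwen25-7b
xinference launch --model-name qwen2.5-instruct --model-engine Transformers \
  --size-in-billions 14 --model-uid qwen25-14b
```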
