
Launching a local qwen2.5 model with Transformers fails: RuntimeError: Failed to launch model, detail: [address=0.0.0.0:50410, pid=15235] Model not found, name: qwen2_5-chat, format: None, size: None, quantization: None #2773

Closed
1 of 3 tasks
ganchun1130 opened this issue Jan 22, 2025 · 10 comments


ganchun1130 commented Jan 22, 2025

System Info

transformers 4.44.2
torch 2.4.1+cu124
xinference 1.2.0

CUDA is 12.4; not running in Docker.

Running Xinference with Docker?

  • docker
  • pip install
  • installation from source

Version info

The error output is as follows:

```
Launch model name: qwen2_5-chat with kwargs: {'model_path': '/mnt/general/ganchun/model/Qwen2.5-0.5B-Instruct'}
Traceback (most recent call last):
  File "/mnt/general/ganchun/miniconda3/envs/paper/bin/xinference", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/xinference/deploy/cmdline.py", line 908, in model_launch
    model_uid = client.launch_model(
                ^^^^^^^^^^^^^^^^^^^^
  File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/xinference/client/restful/restful_client.py", line 999, in launch_model
    raise RuntimeError(
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:50410, pid=15235] Model not found, name: qwen2_5-chat, format: None, size: None, quantization: None
```

The command used to start Xinference

The commands I used:

```shell
xinference-local --host 0.0.0.0 --port 9997
xinference launch --model_path /mnt/general/ganchun/model/Qwen2.5-0.5B-Instruct --model-engine Transformers -n qwen2_5-chat
```

Reproduction

```shell
xinference-local --host 0.0.0.0 --port 9997
xinference launch --model_path /mnt/general/ganchun/model/Qwen2.5-0.5B-Instruct --model-engine Transformers -n qwen2_5-chat
```

Then the error appears:

```
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:50410, pid=15235] Model not found, name: qwen2_5-chat, format: None, size: None, quantization: None
```

Expected behavior

No error; the model loads and works normally.

@XprobeBot XprobeBot added the gpu label Jan 22, 2025
@XprobeBot XprobeBot added this to the v1.x milestone Jan 22, 2025

qinxuye commented Jan 22, 2025

https://inference.readthedocs.io/en/latest/models/builtin/llm/qwen2.5-instruct.html#model-spec-1-pytorch-0-5-billion

The model name is wrong; it should be qwen2.5-instruct. See the documentation for how to load it.
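Based on the documentation linked above, the corrected launch would look roughly like this; a sketch only, assuming the `--size-in-billions` value and flag spellings match the built-in qwen2.5-instruct spec (verify against `xinference launch --help`):

```shell
# Use the built-in model name qwen2.5-instruct instead of the
# nonexistent qwen2_5-chat. The 0_5 size value and --model-path flag
# are assumptions based on the linked model-spec page.
xinference launch \
  --model-name qwen2.5-instruct \
  --model-engine Transformers \
  --size-in-billions 0_5 \
  --model-path /mnt/general/ganchun/model/Qwen2.5-0.5B-Instruct
```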

ganchun1130 (Author) replied:

> https://inference.readthedocs.io/en/latest/models/builtin/llm/qwen2.5-instruct.html#model-spec-1-pytorch-0-5-billion
>
> The model name is wrong; it should be qwen2.5-instruct. See the documentation for how to load it.

That indeed works, but now there is a new error: RuntimeError: Failed to launch model, detail: [address=0.0.0.0:45927, pid=17373] Error while deserializing header: HeaderTooLarge

Can this be resolved?


qinxuye commented Jan 22, 2025

Please paste the full error.

ganchun1130 (Author) replied:

> Please paste the full error.

It's solved now, thank you.

qinxuye closed this as not planned Jan 22, 2025
ganchun1130 (Author) replied:

One more small question: if I deploy two qwen2.5 models, how should I set --model-name?

qinxuye commented Jan 22, 2025

> If I deploy two qwen2.5 models, how should I set --model-name?

The same model? You can set the replica count to 2. --model-name refers to the model's name, so only one is needed; --model-uid is the name of the instance.
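On the command line, the maintainer's suggestion might look like the sketch below; the `--replica` and `--model-uid` spellings, and the instance name, are assumptions to check against `xinference launch --help`:

```shell
# Two replicas of the same model served behind one endpoint.
# "my-qwen25-instance" is a hypothetical instance name; the
# --replica / --model-uid flag spellings should be verified.
xinference launch \
  --model-name qwen2.5-instruct \
  --model-engine Transformers \
  --size-in-billions 0_5 \
  --replica 2 \
  --model-uid my-qwen25-instance
```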

ganchun1130 (Author) replied:

> The same model? You can set the replica count to 2. --model-name refers to the model's name, so only one is needed; --model-uid is the name of the instance.

Got it. What I actually want is to run different models, say one 7B and one 14B. Do I set it up the same way?

qinxuye commented Jan 22, 2025

That doesn't work for different models; by default the uid is just the model name.

ganchun1130 (Author) replied:

> That doesn't work for different models; by default the uid is just the model name.

If the models are from the same family but have different parameter sizes, is the only option to start another xinference service?

qinxuye commented Jan 22, 2025

> If the models are from the same family but have different parameter sizes, is the only option to start another xinference service?

No need; just launch them directly, provided you have enough GPU capacity.
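A sketch of what launching two sizes of the same family against one running server could look like; the size values follow the built-in qwen2.5-instruct specs, while the distinct uids are hypothetical names (confirm the flags with `xinference launch --help`):

```shell
# One xinference server can host both launches. Since the default uid
# is the model name (per the maintainer's note), give each launch a
# distinct --model-uid so the two instances don't collide.
# "qwen25-7b" / "qwen25-14b" are made-up instance names.
xinference launch --model-name qwen2.5-instruct --model-engine Transformers \
  --size-in-billions 7 --model-uid qwen25-7b
xinference launch --model-name qwen2.5-instruct --model-engine Transformers \
  --size-in-billions 14 --model-uid qwen25-14b
```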
