Launching a local qwen2.5 model with the Transformers engine fails: RuntimeError: Failed to launch model, detail: [address=0.0.0.0:50410, pid=15235] Model not found, name: qwen2_5-chat, format: None, size: None, quantization: None #2773
Comments
The model name is wrong; it should be qwen2.5-instruct. You can refer to the documentation for how to load it.
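Applying that fix to the reporter's own command, only the model name changes (path and other flags are taken verbatim from the report):

```shell
# Same launch command as in the report, with the built-in model name
# corrected from qwen2_5-chat to qwen2.5-instruct.
xinference launch --model_path /mnt/general/ganchun/model/Qwen2.5-0.5B-Instruct --model-engine Transformers -n qwen2.5-instruct
```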
That indeed worked, but now there is a new error: RuntimeError: Failed to launch model, detail: [address=0.0.0.0:45927, pid=17373] Error while deserializing header: HeaderTooLarge. Can this be resolved?
Please paste the full error.
Solved it, thank you.
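For future readers: the HeaderTooLarge error comes from the safetensors loader. A .safetensors file begins with an unsigned 64-bit little-endian integer giving the length of the JSON header that follows; a truncated download or an unfetched git-lfs pointer file yields a nonsensical length. A minimal sketch of a check (`read_safetensors_header` is a hypothetical helper, not part of Xinference or safetensors):

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Read and return the JSON header of a .safetensors file.

    The first 8 bytes are a little-endian u64 giving the header length.
    A git-lfs pointer file or a truncated download produces a length
    larger than the file itself, which the safetensors loader reports
    as "HeaderTooLarge".
    """
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            raise ValueError(f"{path}: too small to be a safetensors file")
        (header_len,) = struct.unpack("<Q", prefix)
        file_size = Path(path).stat().st_size
        if header_len > file_size - 8:
            raise ValueError(
                f"{path}: header length {header_len} exceeds file size "
                f"{file_size}; likely truncated or a git-lfs pointer file"
            )
        return json.loads(f.read(header_len))
```

Running it over every `*.safetensors` file in the model directory quickly identifies which weight file needs re-downloading.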
I have one more small question: if I deploy two qwen2.5 models, how should I set --model-name?
The same model? You can set the replica count to 2. The model name refers to the model's name, so you only need one; --model-uid is the name of the instance.
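A sketch of the replica approach for two copies of the same model; `--replica` is the flag name in the Xinference CLI docs, so treat the exact spelling as an assumption for your version:

```shell
# Two replicas of one model under a single model name; requests are
# distributed across the replicas by the service.
xinference launch -n qwen2.5-instruct --model-engine Transformers --replica 2
```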
Understood. What I want is to use different models, for example one 7B and one 14B. Do I set it the same way?
Not for different models; by default the uid is just the model name.
If they are models from the same series but with different parameter sizes, do I have to start another Xinference service?
No need, just launch them directly. The precondition is that you have enough GPU cards.
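Putting the thread's advice together, launching two parameter scales of the same family against one running service might look like this; `--size-in-billions` comes from the Xinference CLI docs, and the explicit `--model-uid` (mentioned earlier in the thread) avoids the default uid-equals-model-name collision, so treat exact flag spellings as assumptions for your version:

```shell
# One 7B and one 14B instance of the same model family on one service,
# each with an explicit UID so the default (uid = model name) does not
# collide. Requires enough GPU memory for both.
xinference launch -n qwen2.5-instruct --model-engine Transformers --size-in-billions 7 --model-uid qwen2.5-7b
xinference launch -n qwen2.5-instruct --model-engine Transformers --size-in-billions 14 --model-uid qwen2.5-14b
```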
System Info / 系統信息
transformers 4.44.2
torch 2.4.1+cu124
xinference 1.2.0
CUDA is 12.4; Docker is not used.
Running Xinference with Docker? No.
Version info
The error output is as follows:
Launch model name: qwen2_5-chat with kwargs: {'model_path': '/mnt/general/ganchun/model/Qwen2.5-0.5B-Instruct'}
Traceback (most recent call last):
File "/mnt/general/ganchun/miniconda3/envs/paper/bin/xinference", line 8, in <module>
sys.exit(cli())
^^^^^
File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 1161, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 1082, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 1697, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 1443, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/core.py", line 788, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/xinference/deploy/cmdline.py", line 908, in model_launch
model_uid = client.launch_model(
^^^^^^^^^^^^^^^^^^^^
File "/mnt/general/ganchun/miniconda3/envs/paper/lib/python3.11/site-packages/xinference/client/restful/restful_client.py", line 999, in launch_model
raise RuntimeError(
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:50410, pid=15235] Model not found, name: qwen2_5-chat, format: None, size: None, quantization: None
The command used to start Xinference
The commands I used:
xinference-local --host 0.0.0.0 --port 9997
xinference launch --model_path /mnt/general/ganchun/model/Qwen2.5-0.5B-Instruct --model-engine Transformers -n qwen2_5-chat
Reproduction
xinference-local --host 0.0.0.0 --port 9997
xinference launch --model_path /mnt/general/ganchun/model/Qwen2.5-0.5B-Instruct --model-engine Transformers -n qwen2_5-chat
Then it errors with:
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:50410, pid=15235] Model not found, name: qwen2_5-chat, format: None, size: None, quantization: None
Expected behavior
No error, and the model works normally.