Deploying paddlenlp-uie inference on a Huawei Ascend NPU 910B4 under k8s with paddlepaddle==2.5.2 / paddlenlp==2.6.1 / paddleocr==2.6.1.3 — inference is extremely slow. What is going on? Guidance appreciated. #8606
Comments
The NPU k8s deployment file is as follows:
The Dockerfile is as follows:

```dockerfile
RUN pip install --disable-pip-version-check --no-cache-dir -i https://mirrors.aliyun.com/pypi/simple paddlepaddle==2.5.2
# Copy the code into the working directory
COPY . /usr/src/app/
WORKDIR /usr/src/app
# Command executed when the container starts
CMD ["python", "/usr/src/app/medical_report_ocr.py"]
```
```
(base) PS C:\Users\12133> kubectl get pod -n hwei
(base) PS C:\Users\12133> kubectl describe pod hwei-ocr -n hwei
(base) PS C:\Users\12133> kubectl logs -f hwei-ocr -n hwei
[2024-06-14 12:52:16,925] [    INFO] - All the weights of UIEM were initialized from the model checkpoint at /home/ma-user/.paddlenlp/taskflow/information_extraction/uie-m-base.
```

Inference does run, but it is extremely slow: information extraction on a single hospital test report took 5840.636544704437 s. What could be causing this? Are the paddlepaddle / paddlenlp / paddleocr versions wrong, or is some other configuration the problem? Please advise, thanks!
Do I need to build armv8 versions of paddlepaddle, paddlenlp, and paddleocr myself? Or do I need to pin a specific NPU when running inference, e.g. with a command like `python -npu -device app.py`? The CPU and NPU resources I requested are as follows:
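For reference, an Ascend NPU request in a pod spec typically looks like the sketch below. The resource name (`huawei.com/Ascend910` here) and the CPU/memory figures are assumptions — the exact name depends on the Ascend device plugin version deployed in the cluster:

```yaml
# Hypothetical pod spec fragment: requesting one Ascend 910 card via the
# Huawei k8s device plugin (resource name varies by plugin version).
resources:
  requests:
    cpu: "4"
    memory: 16Gi
    huawei.com/Ascend910: 1
  limits:
    huawei.com/Ascend910: 1
```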
The medical_report_ocr.py referenced by `CMD ["python", "/usr/src/app/medical_report_ocr.py"]` begins as follows:

```python
import time

import paddlenlp
import paddleocr

print(paddlenlp.__version__)  # paddlenlp.version is not the version string

start = time.time()
start1 = time.time()
```
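To find out where the 5840 s actually goes, it helps to time the OCR stage and the UIE extraction stage separately instead of the whole script. A minimal, library-free sketch of that pattern — the two stage functions are placeholders standing in for the real paddleocr and Taskflow calls:

```python
import time

def run_ocr(image_path):
    # Placeholder for the paddleocr call on the report image.
    return ["line 1", "line 2"]

def run_uie(text_lines):
    # Placeholder for the paddlenlp Taskflow information-extraction call.
    return {"fields": text_lines}

def timed(label, fn, *args):
    """Run fn(*args), print how long it took, and return its result."""
    t0 = time.perf_counter()
    result = fn(*args)
    print(f"{label}: {time.perf_counter() - t0:.3f}s")
    return result

lines = timed("ocr", run_ocr, "report.png")
fields = timed("uie", run_uie, lines)
```

Whichever label dominates the printed timings tells you which stage (and which library) to investigate first.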
@guoshengCS Could you please take a look? Many thanks!
@AllenMeng2009 I'd like to ask: did loading openvino in Taskflow give you any speedup?
@daytime25 Hi! I haven't used openvino acceleration. If I want to, is it just `pip install --upgrade --user openvino-dev`, or do I also need the paddlenlp_ov.zip package? (Where can that be downloaded?) Only then would the statement below take effect, right? Thanks!
@daytime25 Hi! I have now installed it with `pip install --upgrade --user openvino-dev` and set `predictor_type="openvino-inference"` in Taskflow, but it had no effect. Do I need to download paddlenlp_ov.zip? And then
@AllenMeng2009 You need to download paddlenlp_ov.zip; its code differs from the original paddlenlp. I hit an ENABLE_TORCH_CHECKPOINT error and fixed it by changing `from paddlenlp.utils.env` to `from paddlenlp_ov.utils.env` in model_utils.py. In my case (an Intel CPU) this cut the time from 30-odd seconds down to 18 seconds. The remaining problem is that deploying it as serving errors out; a post says the output has to be changed for that to work.
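The one-line import rewrite described above can be applied mechanically. A self-contained sketch — the file content is an in-memory string here, and the imported symbol name is an assumption based on the error quoted above:

```python
# Illustrative: rewrite the model_utils.py import the way described above.
original = "from paddlenlp.utils.env import ENABLE_TORCH_CHECKPOINT\n"
patched = original.replace("from paddlenlp.utils.env",
                           "from paddlenlp_ov.utils.env")
print(patched)
```

In practice you would read model_utils.py, apply the same `str.replace`, and write the file back.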
@daytime25 Hi! Where can paddlenlp_ov.zip be downloaded? Does openvino acceleration also help on NVIDIA GPUs? I have since switched to two NVIDIA A800 80G GPUs; loading the uie-x-base model takes about 7 s and inference on one medical test report takes 5-10 s, which is still slow. Is there any other way to speed this up? Thanks!
Hi, I'd like to ask: did you deploy with the project's docker image? On the Ascend server, what versions of the driver, firmware, and so on did you install — on the host machine or inside the container — and what packages and version requirements are there?
Please describe your question
See the title.