
YOLOv5-Lite (onnx) (v1.5, May 22) similar errors: 1. ValueError: operands could not be broadcast together with shapes (3,2,80,85) (19200,2) 2. IndexError: index 4 is out of bounds for axis 0 with size 3 #263

inbigtoiletboy opened this issue May 28, 2024 · 12 comments

@inbigtoiletboy

How the errors arise:
1. Trained a best.pt with YOLOv5-Lite's train.py
2. Converted it to a best.onnx model with export.py
3. (First error) Ran inference with python_demo/onnxruntime/v5lite.py and got the following error:

(1, 3, 80, 80, 85)
Traceback (most recent call last):
  File "C:\Users\*\Desktop\YOLOv5-Lite-master\python_demo\onnxruntime\v5lite.py", line 109, in <module>
    srcimg = net.detect(srcimg.copy())
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\*\Desktop\YOLOv5-Lite-master\python_demo\onnxruntime\v5lite.py", line 88, in detect
    srcimg = self.postprocess(srcimg, outs, (newh, neww, top, left))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\*\Desktop\YOLOv5-Lite-master\python_demo\onnxruntime\v5lite.py", line 46, in postprocess
    scores, classId = detection[4], detection[5]
                      ~~~~~~~~~^^^
IndexError: index 4 is out of bounds for axis 0 with size 3

Process finished with exit code 1

The cause is that best.onnx's output format does not match what the demo expects (you can inspect your own trained model with https://netron.app/).
[screenshot: Netron view of the onnx model's outputs]
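If you'd rather check from the command line than in Netron, here is a minimal sketch using the onnx package (assuming the model file is best.onnx in the current directory):

import onnx

model = onnx.load("best.onnx")
for out in model.graph.output:
    # dynamic dims carry a dim_param string, static dims a dim_value int
    dims = [d.dim_param or d.dim_value for d in out.type.tensor_type.shape.dim]
    print(out.name, dims)

A plain export prints fixed grid shapes like [1, 3, 80, 80, 85]; an --end2end export prints a dynamic first dimension and 6 columns per detection.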

/---------------- Fix for the first error ----------------/

Step 1: in step 2 above (converting to best.onnx with export.py), export with end2end.
Say you originally ran export.py like this:

python .\export.py

Then exporting with end2end just means appending --end2end, i.e.:

python .\export.py --end2end

What --end2end does:
    -- it bakes the post-processing into the model, so you don't need to write any post-processing at inference time

With that, your best.onnx has the correct output format, as shown below:
[screenshot: Netron view of the end2end output node]

In float32[Gatheroutputs_dim_0,6], Gatheroutputs_dim_0 means:
    -- the output is gathered from the previous node
    -- its shape is dynamic; at the moment onnxruntime, Ascend and TensorRT support this, while ncnn and MNN do not
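As a quick illustration of the dynamic output in onnxruntime, a sketch (the 640x640 input size is an assumption; use your own):

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("best.onnx")
inp = sess.get_inputs()[0]
blob = np.zeros((1, 3, 640, 640), dtype=np.float32)   # dummy input
outs = sess.run(None, {inp.name: blob})[0]
print(outs.shape)   # (N, 6): N detections of (x1, y1, x2, y2, score, classId)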

Step 2: update the code in python_demo/onnxruntime/v5lite.py:

If you downloaded YOLOv5-Lite after 2024/5/20, the code needs no changes:
https://github.com/ppogg/YOLOv5-Lite/blob/master/python_demo/onnxruntime/v5lite.py

Copy all of it into python_demo/onnxruntime/v5lite.py,
set the various parameters, and you can run inference.
Running v5lite.py gives a result like this:
[screenshot: detection result produced by v5lite.py]
The annotated image is saved to python_demo/onnxruntime/save.jpg
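For reference, this is roughly how it is invoked (flag names follow the script's argparse; --imgpath and the default file names are assumptions, point them at your own files):

python python_demo/onnxruntime/v5lite.py --modelpath best.onnx --classfile dog.names --imgpath image.jpg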

If you downloaded YOLOv5-Lite before 2024/5/20 and the code above does not solve your problem, re-download YOLOv5-Lite.zip

/-------------------------------------------------------------/

The second error looks like this:

C:\Users\*\Desktop\YOLOv5-Lite-master\venv\Scripts\python.exe C:\Users\*\Desktop\YOLOv5-Lite-master\onnx3.py 
True
Traceback (most recent call last):
  File "C:\Users\*\Desktop\YOLOv5-Lite-master\onnx3.py", line 141, in <module>
    det_boxes, scores, ids = infer_img(img0, net, model_h, model_w, nl, na, stride, anchor_grid,
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\*\Desktop\YOLOv5-Lite-master\onnx3.py", line 102, in infer_img
    outs = cal_outputs(outs, nl, na, model_w, model_h, anchor_grid, stride)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\*\Desktop\YOLOv5-Lite-master\onnx3.py", line 58, in cal_outputs
    outs[row_ind:row_ind + length, 0:2] = (outs[row_ind:row_ind + length, 0:2] * 2. - 0.5 + np.tile(
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: operands could not be broadcast together with shapes (3,2,80,85) (19200,2) 

Process finished with exit code 1

If you hit this error too, chances are you followed this article:

"Porting and deploying YOLOv5-Lite object detection on a Raspberry Pi 4B (with a training tutorial)"
https://blog.csdn.net/black_sneak/article/details/131374492

Cause of the error:
That article's camera-feed inference script was adapted from the YOLOv5-Lite code of its time, and a lot has changed since,
so an onnx model trained with today's code cannot be run through that old script.
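A quick way to tell which kind of model you have is to look at the raw output shape, e.g. (a sketch, assuming a 640x640 input):

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("best.onnx")
blob = np.zeros((1, 3, 640, 640), dtype=np.float32)
outs = sess.run(None, {sess.get_inputs()[0].name: blob})[0]
if outs.ndim == 2 and outs.shape[1] == 6:
    print("end2end model: boxes come out decoded, no grid/anchor math needed")
else:
    print("raw-head model", outs.shape, "- needs the old decode code (grids, anchors, NMS)")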

/---------------- Fix for the second error ----------------/

Solve the first error first, then tackle this one:
rewrite the camera-feed inference script.

It is actually simple: paste the class yolov5_lite code from the current python_demo/onnxruntime/v5lite.py into the old camera script, then rework its main function following v5lite.py. That's all there is to it.

For the record: the image below shows me detecting a rhythm-game screen, purely for learning (the detections just look striking), and it will not be used to build cheats.
Resist cheating, starting with you and me.

[screenshot: detections overlaid on a rhythm-game screen]

Here is the code after my changes. I don't recommend copy-pasting it as-is; try writing it yourself, you can do it. (In the script, press 's' to toggle detection and 'q' to quit.)



import argparse

import cv2
import numpy as np
import onnxruntime as ort
import time


class yolov5_lite():
    def __init__(self, model_pb_path, label_path, confThreshold=0.5, nmsThreshold=0.5):
        so = ort.SessionOptions()
        so.log_severity_level = 3
        self.net = ort.InferenceSession(model_pb_path, so)
        with open(label_path, 'r') as f:
            self.classes = [line.strip() for line in f.readlines()]

        self.confThreshold = confThreshold
        self.nmsThreshold = nmsThreshold
        # model input size (h, w), read from the onnx input: float32[1,3,h,w]
        self.input_shape = (self.net.get_inputs()[0].shape[2], self.net.get_inputs()[0].shape[3])

    def letterBox(self, srcimg, keep_ratio=True):
        top, left, newh, neww = 0, 0, self.input_shape[0], self.input_shape[1]
        if keep_ratio and srcimg.shape[0] != srcimg.shape[1]:
            hw_scale = srcimg.shape[0] / srcimg.shape[1]
            if hw_scale > 1:
                newh, neww = self.input_shape[0], int(self.input_shape[1] / hw_scale)
                img = cv2.resize(srcimg, (neww, newh), interpolation=cv2.INTER_AREA)
                left = int((self.input_shape[1] - neww) * 0.5)
                img = cv2.copyMakeBorder(img, 0, 0, left, self.input_shape[1] - neww - left, cv2.BORDER_CONSTANT,
                                         value=0)  # add border
            else:
                newh, neww = int(self.input_shape[0] * hw_scale), self.input_shape[1]
                img = cv2.resize(srcimg, (neww, newh), interpolation=cv2.INTER_AREA)
                top = int((self.input_shape[0] - newh) * 0.5)
                img = cv2.copyMakeBorder(img, top, self.input_shape[0] - newh - top, 0, 0, cv2.BORDER_CONSTANT, value=0)
        else:
            # cv2.resize takes (width, height); self.input_shape is (h, w)
            img = cv2.resize(srcimg, (self.input_shape[1], self.input_shape[0]), interpolation=cv2.INTER_AREA)
        return img, newh, neww, top, left

    def postprocess(self, frame, outs, pad_hw):
        newh, neww, padh, padw = pad_hw
        frameHeight = frame.shape[0]
        frameWidth = frame.shape[1]
        ratioh, ratiow = frameHeight / newh, frameWidth / neww
        classIds = []
        confidences = []
        boxes = []
        # each end2end output row is (x1, y1, x2, y2, score, classId) in letterbox coordinates
        for detection in outs:
            scores, classId = detection[4], detection[5]
            if scores > self.confThreshold:
                x1 = int((detection[0] - padw) * ratiow)
                y1 = int((detection[1] - padh) * ratioh)
                x2 = int((detection[2] - padw) * ratiow)
                y2 = int((detection[3] - padh) * ratioh)
                classIds.append(int(classId))
                confidences.append(float(scores))
                boxes.append([x1, y1, x2 - x1, y2 - y1])  # NMSBoxes expects (x, y, w, h)

        # Non-maximum suppression to drop redundant overlapping boxes
        # (the end2end model already applies NMS, so this is mostly a safety net).
        indices = cv2.dnn.NMSBoxes(boxes, confidences, self.confThreshold, self.nmsThreshold)

        for ind in np.array(indices).flatten():  # flatten() handles both old and new OpenCV return shapes
            x, y, w, h = boxes[ind]
            frame = self.drawPred(frame, classIds[ind], confidences[ind], x, y, x + w, y + h)
        return frame

    def drawPred(self, frame, classId, conf, x1, y1, x2, y2):
        # Draw a bounding box.
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), thickness=2)

        label = '%.2f' % conf
        text = '%s:%s' % (self.classes[int(classId)], label)

        # Display the label at the top of the bounding box
        labelSize, baseLine = cv2.getTextSize(text, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
        y1 = max(y1, labelSize[1])
        cv2.putText(frame, text, (x1, y1 - 10), cv2.FONT_HERSHEY_TRIPLEX, 0.5, (0, 255, 0), thickness=1)
        return frame

    def detect(self, srcimg):
        # letterbox to the model input size, BGR->RGB, scale to [0,1], HWC->NCHW
        img, newh, neww, top, left = self.letterBox(srcimg)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = img.astype(np.float32) / 255.0
        blob = np.expand_dims(np.transpose(img, (2, 0, 1)), axis=0)

        t1 = time.time()
        outs = self.net.run(None, {self.net.get_inputs()[0].name: blob})[0]
        cost_time = time.time() - t1
        print(outs.shape)  # debug: (N, 6) for an end2end model

        srcimg = self.postprocess(srcimg, outs, (newh, neww, top, left))
        infer_time = 'Inference Time: ' + str(int(cost_time * 1000)) + 'ms'
        cv2.putText(srcimg, infer_time, (5, 20), cv2.FONT_HERSHEY_TRIPLEX, 0.5, (0, 0, 0), thickness=1)
        return srcimg



if __name__ == "__main__":

    parser = argparse.ArgumentParser()
    # parser.add_argument('--imgpath', type=str, default='image.jpg', help="image path")
    parser.add_argument('--modelpath', type=str, default='best.onnx', help="onnx filepath")
    parser.add_argument('--classfile', type=str, default='dog.names', help="classname filepath")
    parser.add_argument('--confThreshold', default=0.5, type=float, help='class confidence')
    parser.add_argument('--nmsThreshold', default=0.6, type=float, help='nms iou thresh')
    args = parser.parse_args()

    net = yolov5_lite(args.modelpath, args.classfile, confThreshold=args.confThreshold, nmsThreshold=args.nmsThreshold)
    
    # Capture from the default camera; press 's' to toggle detection, 'q' to quit
    video = 0
    cap = cv2.VideoCapture(video)
    flag_det = False
    while True:
        success, img0 = cap.read()
        if not success:
            break

        if flag_det:
            t1 = time.time()
            img0 = net.detect(img0.copy())
            t2 = time.time()

            str_FPS = "FPS: %.2f" % (1. / (t2 - t1))
            cv2.putText(img0, str_FPS, (50, 50), cv2.FONT_HERSHEY_COMPLEX, 1, (0, 255, 0), 3)

        cv2.imshow("video", img0)
        key = cv2.waitKey(1) & 0xFF
        if key == ord('q'):
            break
        elif key == ord('s'):
            flag_det = not flag_det
            print(flag_det)

    cap.release()
    cv2.destroyAllWindows()

/-------------------------------------------------------------/

That's everything.
Many thanks to pogg, the author of YOLOv5-Lite, for all the help.

@ppogg (Owner) commented May 30, 2024

Got it, thanks!

@defzhangaa

In my case I added the --grid flag when exporting the onnx, and then it worked.

ppogg pinned this issue Jun 22, 2024
@7enterprise

thanks!!

@swz001 commented Aug 2, 2024

With v1.5, export the onnx with --concat and you can use that camera demo.

@ZJDATY commented Aug 19, 2024

With v1.5, export the onnx with --concat. One more difference: the output node used to be named output, but with --concat it is named outputs, so inference code that resolves the output by node name will fail to find output. It would be good to unify these.
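If your code resolves outputs by name, it is more robust to ask the session for the name instead of hard-coding output/outputs. A sketch (the 640x640 input is an assumption):

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("best.onnx")
out_name = sess.get_outputs()[0].name            # 'output' on older exports, 'outputs' with --concat
blob = np.zeros((1, 3, 640, 640), dtype=np.float32)
outs = sess.run([out_name], {sess.get_inputs()[0].name: blob})[0]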

@EdwardML

Thank you so much; I had been flailing at this for a whole day.

@xiaoer99015

Hi, what is this dog.names file?

@zhangmingsu commented Nov 7, 2024 via email

It holds the label names of the images you trained on. For example, if the label name you used during training was person, then you put person in the file. I haven't found any special meaning in the name itself; it is probably just a file name.

@xiaoer99015

OK, thanks.

@xiaoer99015

Hi, how should multiple labels be written in dog.names, and how do they map to classes? When I put them on separate lines only one is shown, and when I write them all on one line everything on that line is shown.

@inbigtoiletboy (Author)

The dog.names file is used to store the labels.

Before training, we edit a *.yaml file (the one below is for recognizing Chinese chess pieces):
[screenshot: the data yaml, including the names list]

names: [ 'R_che', 
'R_ma',
'R_xiang',
'R_shi',
'R_shuai',
'R_pao',
'R_zu',
'B_che',
'B_ma',
'B_xiang',
'B_shi',
'B_shuai',
'B_pao',
'B_zu' ]

The order of names here must match the order of the labels inside the *.names file.
[screenshot: the matching .names file]
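For the Chinese-chess example above, the matching .names file would simply list one label per line, in the same order as the yaml (the demo loads it with readlines(), one class per line):

R_che
R_ma
R_xiang
R_shi
R_shuai
R_pao
R_zu
B_che
B_ma
B_xiang
B_shi
B_shuai
B_pao
B_zu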

@xiaoer99015


Got it, thanks a lot!
