NMS python






Non-maximum suppression (NMS):

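The procedure is as follows (assume the input to non-maximum suppression is a 2000×20 score matrix, where 2000 is the number of candidate boxes on the image and 20 is the number of classes):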

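① Sort each column of the 2000×20 matrix in descending order (each column holds the scores for one of the 20 classes; the same class may have several targets in one image, for example two in the figure).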



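② In each column, start from the proposal box with the highest score and compute its IoU with every lower-scored proposal box further down the column. If IoU > threshold, remove the lower-scored proposal box; otherwise the image is taken to contain several objects of the same class. [Proposal boxes for two different targets of the same class barely overlap, so removing only the boxes with a large overlap in effect removes duplicate boxes that cover the same target.]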

③ Starting from the proposal box with the next-highest score that remains in the column, repeat step ②.



④ Repeat step ③ until every proposal box in the column has been processed.

⑤ Traverse all columns of the 2000×20 matrix in this way, so that non-maximum suppression is performed once for every object class.
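Steps ① to ⑤ can be sketched on a small score matrix as follows. This is a minimal sketch, not the reference implementation: the names `iou_one_to_many` and `nms_per_class` are hypothetical, boxes are assumed to be (x1, y1, x2, y2) with inclusive pixel coordinates, and the 3×2 example data merely stands in for the 2000×20 matrix.

```python
import numpy as np


def iou_one_to_many(box, others):
    """IoU of one box against an (N, 4) array of boxes (inclusive pixel coordinates)."""
    iw = np.clip(np.minimum(box[2], others[:, 2]) - np.maximum(box[0], others[:, 0]) + 1, 0, None)
    ih = np.clip(np.minimum(box[3], others[:, 3]) - np.maximum(box[1], others[:, 1]) + 1, 0, None)
    inter = iw * ih
    area = (box[2] - box[0] + 1) * (box[3] - box[1] + 1)
    areas = (others[:, 2] - others[:, 0] + 1) * (others[:, 3] - others[:, 1] + 1)
    return inter / (area + areas - inter)


def nms_per_class(boxes, scores, threshold):
    """Return, for each class (column of scores), the indices of the kept boxes."""
    kept = []
    for c in range(scores.shape[1]):            # step 5: visit every column
        order = np.argsort(-scores[:, c])       # step 1: sort the column in descending order
        keep = []
        while order.size > 0:                   # step 4: until the column is exhausted
            best = order[0]                     # steps 2 and 3: highest remaining score
            keep.append(int(best))
            rest = order[1:]
            ious = iou_one_to_many(boxes[best], boxes[rest])
            order = rest[ious <= threshold]     # remove lower-scored boxes with IoU > threshold
        kept.append(keep)
    return kept


# Hypothetical data: 3 candidate boxes, 2 classes.
boxes = np.array([[0, 0, 9, 9], [1, 1, 10, 10], [20, 20, 29, 29]], dtype=float)
scores = np.array([[0.9, 0.1], [0.8, 0.7], [0.3, 0.6]])
kept = nms_per_class(boxes, scores, threshold=0.5)
print(kept)  # [[0, 2], [1, 2]]
```

Boxes 0 and 1 overlap heavily (IoU = 81/119), so in each class only the higher-scored of the two survives, while the distant box 2 is kept in both classes.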

import numpy as np
import random  # random and cv2 are not needed by non_max_suppress itself
import cv2


def non_max_suppress(predicts_dict, threshold):
    # Run NMS once per category: each key maps to all boxes of that category.
    for object_name, bbox in predicts_dict.items():
        bbox_array = np.array(bbox, dtype=float)

        # Top-left corner (x1, y1), bottom-right corner (x2, y2) and confidence of each box.
        # The top-left corner of the image is the origin (0, 0) and the bottom-right corner
        # can be regarded as (1, 1), so x grows from left to right and y from top to bottom.
        x1 = bbox_array[:, 0]
        y1 = bbox_array[:, 1]
        x2 = bbox_array[:, 2]
        y2 = bbox_array[:, 3]
        scores = bbox_array[:, 4]

        # argsort returns the indices that sort the array in ascending order;
        # [::-1] reverses them, so order holds the indices by descending score.
        order = scores.argsort()[::-1]

        # Area of every box in the current class (NumPy multiplies element-wise,
        # like .* in MATLAB). With x1 = 3 and x2 = 5 the box covers pixels 3, 4, 5,
        # i.e. 5 - 3 + 1 = 3 pixels rather than 5 - 3 = 2, hence the + 1.
        areas = (x2 - x1 + 1) * (y2 - y1 + 1)

        keep = []
        # Visit the boxes from highest to lowest confidence and remove every box
        # whose IoU with the current box is greater than the threshold.
        while order.size > 0:
            i = order[0]
            keep.append(i)  # keep the index of the box with the highest remaining confidence

            # Corners of the intersection between box i and every other remaining box.
            # Each result has one entry per remaining box, i.e. the current box count minus one.
            xx1 = np.maximum(x1[i], x1[order[1:]])
            yy1 = np.maximum(y1[i], y1[order[1:]])
            xx2 = np.minimum(x2[i], x2[order[1:]])
            yy2 = np.minimum(y2[i], y2[order[1:]])

            inter = np.maximum(0.0, xx2 - xx1 + 1) * np.maximum(0.0, yy2 - yy1 + 1)
            # Broadcasting computes the IoU of box i with all remaining boxes at once.
            iou = inter / (areas[i] + areas[order[1:]] - inter)

            # Indices (into order[1:]) of the boxes whose IoU does not exceed the threshold.
            inds = np.where(iou <= threshold)[0]
            # iou was computed without box i itself, so its indices are offset by one
            # relative to order; add 1 before updating order.
            order = order[inds + 1]

        bbox = bbox_array[keep]
        predicts_dict[object_name] = bbox.tolist()
    return predicts_dict
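As a quick check of the inclusive-pixel convention (the + 1 in widths and heights) used in the code above, the IoU of two small boxes can be worked out by hand. The helper name `iou_inclusive` and the two boxes are hypothetical and only serve as an illustration.

```python
def iou_inclusive(a, b):
    """IoU of two boxes (x1, y1, x2, y2) given in inclusive pixel coordinates."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]) + 1)  # intersection width in pixels
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]) + 1)  # intersection height in pixels
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


box_a = (0, 0, 9, 9)     # covers pixels 0..9 in x and y: 10 x 10 = 100 pixels
box_b = (5, 5, 14, 14)   # same size, shifted by 5 pixels in x and y
iou_ab = iou_inclusive(box_a, box_b)
print(iou_ab)  # 25 / (100 + 100 - 25) = 25 / 175, about 0.143
```

The overlap spans pixels 5..9 in each direction, which is 5 pixels and not 4, so the intersection is 25 pixels; with a threshold of 0.5 these two boxes would both be kept.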