
YOLOv5 positive sample selection: an illustrated code walkthrough

Source: compiled by 好特 | Date: 2024-07-10 18:50:04


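How YOLOv5 selects positive samples

"Positive sample" is short for "anchor positive sample": the thing being labeled positive is an anchor box, i.e. a prior box.
Prior boxes: starting with YOLO v2, which borrowed the idea from Faster R-CNN, a fixed set of preset boxes is defined so that the model does not have to predict object scale and coordinates directly. It only predicts the offset from a prior box to the ground-truth object, which makes prediction easier.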

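Rules for obtaining positive samples

YOLOv5 increases the number of positive samples in the following 3 ways:

1. Cross-anchor prediction

Suppose a GT box falls in a grid cell of one prediction branch. That cell has 3 anchors of different sizes; if the GT matches more than one of them, every matching anchor is used to predict the GT box, i.e. one GT box can be predicted by several anchors.
How it works:
Unlike IoU-based matching, YOLOv5 matches on width/height ratios. Dividing the GT width and height element-wise by each anchor's width and height gives ratio1; dividing the anchor by the GT gives ratio2. The larger of ratio1 and ratio2, taken over both width and height, is the final ratio, which is compared with a threshold (4 by default). Anchors whose ratio is below the threshold are matched.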

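import torch

# Anchor widths and heights in grid units
anchor_boxes = torch.tensor([[1.25000, 1.62500], [2.00000, 3.75000], [4.12500, 2.87500]])
gt_box = torch.tensor([5, 4])  # GT width and height in grid units

ratio1 = gt_box / anchor_boxes
ratio2 = anchor_boxes / gt_box
# Worst-case ratio over width and height for each anchor
ratio = torch.max(ratio1, ratio2).max(1)[0]
print(ratio)

anchor_t = 4
res = ratio < anchor_t  # True where the anchor matches the GT
print(res)


 tensor([4.0000, 2.5000, 1.3913])
tensor([False,  True,  True])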

 
  The anchors matched to the GT are anchor2 and anchor3.
  

  
 
 
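  2. Cross-grid prediction

  Suppose a GT box falls in a grid cell of one prediction branch. That cell has four neighbors (left, up, right, down). Depending on where the GT center lies, the 2 nearest neighbors also become prediction cells, so one GT box can be predicted by 3 cells.

  Worked example (figure not preserved):

  The GT box center lies in grid1, so grid1 is selected. To add positive samples, the cells above, below, left and right of grid1 are candidates; because the GT center is closer to grid2 and grid3, those two are matched as well.

  From the anchor matching in the previous step, the GT matches anchor2 and anchor3, so on this layer it yields 6 positive samples:

   grid1_anchor2, grid1_anchor3
   grid2_anchor2, grid2_anchor3
   grid3_anchor2, grid3_anchor3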
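
  The count of 6 is simply the product of the selected cells and the matched anchors. A minimal sketch of that pairing, in plain Python; the function name `list_positives` and the string labels are illustrative assumptions, not part of YOLOv5:

```python
from itertools import product


def list_positives(cells, anchor_ids):
    """Pair every selected grid cell with every matched anchor."""
    return [f"{cell}_{anchor}" for cell, anchor in product(cells, anchor_ids)]


# Hypothetical labels taken from the worked example above
cells = ["grid1", "grid2", "grid3"]    # center cell plus its 2 nearest neighbors
anchors = ["anchor2", "anchor3"]       # anchors that passed the ratio test

positives = list_positives(cells, anchors)
print(len(positives))
print(positives)
```

  Each neighboring cell reuses the anchors matched in the center cell, which is why no new ratio test is needed for grid2 and grid3.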
  
 
 
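  3. Cross-branch prediction

  A GT box that matches anchors on 2 or even all 3 prediction branches can be predicted by each of those branches. Anchor matching and grid matching are repeated on every branch, and together they give all the positive samples for that GT.

  In the three Prediction outputs at different scales, the GT can be matched to positive samples on each of them (figure not preserved).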
 
 
  
 
 
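  Positive sample selection

  Selection does four things:

   Use the width/height ratio to find suitable anchors
   Use the cell containing the anchor to obtain the extension cells above, below, left and right
   Compute the offset of the label box relative to the cell's top-left corner
   Return the anchors, cell indices, offsets, classes, and so on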
  
 
 
  Anchor values in YOLOv5
 
 anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

 
  The YOLOv5 network has three output sizes, and each output has its own set of anchors:

   8x downsampling: [10,13, 16,30, 33,23]
   16x downsampling: [30,61, 62,45, 59,119]
   32x downsampling: [116,90, 156,198, 373,326]
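
  The anchors above are in input-image pixels, while the debug dumps in this article show them in grid units (for example `[1.25, 1.625]` on the 80x80 layer). The two agree if each anchor is divided by its layer's stride. A small sketch of that conversion, where `ANCHORS_PX` and `anchors_in_grid_units` are names assumed for illustration:

```python
# Anchor (width, height) pairs in pixels, keyed by downsampling stride
ANCHORS_PX = {
    8: [(10, 13), (16, 30), (33, 23)],
    16: [(30, 61), (62, 45), (59, 119)],
    32: [(116, 90), (156, 198), (373, 326)],
}


def anchors_in_grid_units(anchors_px, stride):
    """Convert pixel-sized anchors to units of one grid cell."""
    return [(w / stride, h / stride) for w, h in anchors_px]


p3 = anchors_in_grid_units(ANCHORS_PX[8], 8)
print(p3)  # [(1.25, 1.625), (2.0, 3.75), (4.125, 2.875)]
```

  Working in grid units lets the same ratio test be applied on every layer after the labels have been scaled to that layer's feature map.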
  
 
 
  Annotated code


  yolov5/utils/loss.py
 
     def build_targets(self, p, targets):
        # Build targets for compute_loss(), input targets(image,class,x,y,w,h)

        """
        p: predictions
        targets: ground-truth labels (gt)
        (Pdb) pp p[0].shape
        torch.Size([1, 3, 80, 80, 7])
        (Pdb) pp p[1].shape
        torch.Size([1, 3, 40, 40, 7])
        (Pdb) pp p[2].shape
        torch.Size([1, 3, 20, 20, 7])
        (Pdb) pp targets.shape
        torch.Size([23, 6])
        """
        na, nt = self.na, targets.shape[0]  # number of anchors, targets
        tcls, tbox, indices, anch = [], [], [], []
        
        """
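        tcls    class ids
        tbox    offset of the gt center from the top-left corner of its grid cell, plus gt width/height; offsets relative to the extension cells are stored too
        indices image_id, anchor_id, grid y index, grid x index
        anch    anchor width and height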
        """
        
        gain = torch.ones(7, device=self.device)  # normalized to gridspace gain
        ai = torch.arange(na, device=self.device).float().view(na, 1).repeat(1, nt)  # same as .repeat_interleave(nt)
        """
        (Pdb) ai
        tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
                [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
                [2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]], device='cuda:0')
        (Pdb) ai.shape
        torch.Size([3, 23])
        """
        targets = torch.cat((targets.repeat(na, 1, 1), ai[..., None]), 2)  # append anchor indices

        g = 0.5  # bias
        off = torch.tensor(
            [
                [0, 0],
                [1, 0],
                [0, 1],
                [-1, 0],
                [0, -1],  # j,k,l,m
                # [1, 1], [1, -1], [-1, 1], [-1, -1],  # jk,jm,lk,lm
            ],
            device=self.device).float() * g  # offsets

        for i in range(self.nl):
            anchors, shape = self.anchors[i], p[i].shape
            """
            (Pdb) anchors
            tensor([[1.25000, 1.62500],
                    [2.00000, 3.75000],
                    [4.12500, 2.87500]], device='cuda:0')
            (Pdb) shape
            torch.Size([1, 3, 80, 80, 7])
            """
            gain[2:6] = torch.tensor(shape)[[3, 2, 3, 2]]  # xyxy gain
            """
            (Pdb) gain
            tensor([ 1.,  1., 80., 80., 80., 80.,  1.], device='cuda:0')
            """

            # Match targets to anchors
            t = targets * gain  # shape(3,n,7)  # scale the normalized xywh to the current feature map
            """
            (Pdb) t.shape
            torch.Size([3, 23, 7])
            """

            if nt:
                # Matches
                r = t[..., 4:6] / anchors[:, None]  # wh ratio
                j = torch.max(r, 1 / r).max(2)[0] < self.hyp['anchor_t']  # compare
                # j = wh_iou(anchors, t[:, 4:6]) > model.hyp['iou_t']  # iou(3,n)=wh_iou(anchors(3,2), gwh(n,2))
                t = t[j]  # filter
                """
                (Pdb) t.shape
                torch.Size([3, 23, 7]) -> torch.Size([62, 7])
                """

                # Offsets
                gxy = t[:, 2:4]  # grid xy
                gxi = gain[[2, 3]] - gxy  # inverse
                j, k = ((gxy % 1 < g) & (gxy > 1)).T
                """
                (Pdb) ((gxy % 1 < g) & (gxy > 1)).shape
                torch.Size([186, 2])
                (Pdb) ((gxy % 1 < g) & (gxy > 1)).T.shape
                torch.Size([2, 186])
                """
                l, m = ((gxi % 1 < g) & (gxi > 1)).T

                j = torch.stack((torch.ones_like(j), j, k, l, m))
                """
                torch.ones_like(j): the grid cell containing the gt center
                j, k, l, m: the left, up, right and down extension cells
                
                torch.Size([5, 51])
                """
                t = t.repeat((5, 1, 1))[j]
                """
                The labels are repeated 5 times and indexed with the mask above, keeping one row per selected grid cell
                (Pdb) pp t.shape
                torch.Size([153, 7])
                (Pdb) t.repeat((5, 1, 1)).shape
                torch.Size([5, 153, 7])
                (Pdb) pp t.shape
                torch.Size([232, 7])
                """
                offsets = (torch.zeros_like(gxy)[None] + off[:, None])[j]

                """
                Offset for every selected grid cell; subtracting it from the label center gives the final grid cell
                (Pdb) pp offsets.shape
                torch.Size([529, 2])
                """
            else:
                t = targets[0]
                offsets = 0


            # Define
            bc, gxy, gwh, a = t.chunk(4, 1)  # (image, class), grid xy, grid wh, anchors
            a, (b, c) = a.long().view(-1), bc.long().T  # anchors, image, class
            gij = (gxy - offsets).long()
            """
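            Subtract the offset from the gt center to get the final grid cell coordinates; the center cell itself is included (offset 0).
            gxy is the gt center on the current feature map, e.g. (55.09, 36.23) on 80*80; subtracting the offset and truncating gives a grid cell coordinate, e.g. (55, 36)
            (Pdb) pp gij.shape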
            torch.Size([529, 2])
            (Pdb) pp gij
            tensor([[ 9, 22],
                [ 2, 23],
                [ 6, 23],
                ...,
                [ 5, 19],
                [ 5, 38],
                [15, 36]], device='cuda:0')
            """
            gi, gj = gij.T  # grid indices

            # Append
            # indices holds: image_id, anchor_id (0, 1, 2), grid y index, grid x index. These cells are the positive samples
            indices.append((b, a, gj.clamp_(0, shape[2] - 1), gi.clamp_(0, shape[3] - 1)))  # image, anchor, grid

            # tbox holds the offset of the gt center from the top-left corner of its grid cell; offsets relative to the extension cells are computed too
            tbox.append(torch.cat((gxy - gij, gwh), 1))  # box
            """
            (Pdb) pp tbox[0].shape
                torch.Size([312, 4])
            (Pdb) pp tbox[0]
                tensor([[ 0.70904,  0.50893,  4.81701,  5.14418],
                        [ 0.28421,  0.45330,  3.58872,  4.42822],
                        [ 0.44398,  0.60475,  3.79576,  4.98174],
                        ...,
                        [ 0.59653, -0.37711,  3.97289,  4.44963],
                        [ 0.32074, -0.05419,  5.19988,  5.59987],
                        [ 0.28691, -0.38742,  5.79986,  6.66651]], device='cuda:0')
            (Pdb) gxy
                tensor([[ 9.19086, 22.46842],
                        [ 2.50407, 23.72271],
                        [ 6.35452, 23.75447],
                        ...,
                        [ 5.91273, 18.75906],
                        [ 5.16037, 37.97290],
                        [15.64346, 35.80629]], device='cuda:0')
                (Pdb) gij
                tensor([[ 9, 22],
                        [ 2, 23],
                        [ 6, 23],
                        ...,
                        [ 5, 19],
                        [ 5, 38],
                        [15, 36]], device='cuda:0')
                (Pdb) gxy.shape
                torch.Size([529, 2])
                (Pdb) gij.shape
                torch.Size([529, 2])
            """
            anch.append(anchors[a])  # anchors: anchor width and height
            tcls.append(c)  # class id
            
            """
            (Pdb) pp anch[0].shape
                torch.Size([312, 2])
                (Pdb) pp tcls[0].shape
                torch.Size([312])
            """

        return tcls, tbox, indices, anch

 
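  Basic idea of the code

   Pass in the predictions and the labels. The predictions are used only to obtain the feature-map size of the branch being processed
   Loop over the feature maps and collect positive samples on each
   Take the size of the current feature map and scale the normalized label coordinates to it
   Compute the width/height ratio between gt and anchor: True if within the threshold, False otherwise. Drop the anchors marked False
   Compute the distance from the gt center to the top/left cell borders and to the bottom/right borders, select the qualifying grid cells, and compute the offset from the gt's own cell to each selected cell
   Combine the offsets from the previous step with the gt center to get the coordinates of all selected cells
   Subtract the cell coordinates from the gt center to get the gt offset relative to the cell's top-left corner, for both the center cell and the extension cells
   Collect everything, namely:

   indices: image_id, anchor_id, grid y index, grid x index
   tbox: offset of the gt center from the top-left corner of its grid cell, including offsets relative to the extension cells
   anchors: anchor width and height
   class: class id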
  
 
 
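  Preparation


  Before selecting positive samples, some preparation is needed, mainly gathering the required parameters.

 def build_targets(self, p, targets):
    pass 


  Input parameters:

  targets holds the labels for this batch of images. Each row contains:

   image, class, x, y, w, h.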
  
 
 (Pdb) pp targets.shape
torch.Size([63, 6])

tensor([[0.00000, 1.00000, 0.22977, 0.56171, 0.08636, 0.09367],
        [0.00000, 0.00000, 0.06260, 0.59307, 0.07843, 0.08812],
        [0.00000, 0.00000, 0.15886, 0.59386, 0.06021, 0.06430],
        [0.00000, 0.00000, 0.31930, 0.58910, 0.06576, 0.09129],
        [0.00000, 0.00000, 0.80959, 0.70458, 0.23025, 0.26275],
        [1.00000, 1.00000, 0.85008, 0.07597, 0.09781, 0.11827],
        [1.00000, 0.00000, 0.22484, 0.09267, 0.14065, 0.18534]


 
  p is the model's prediction. Here it is used only to obtain the size of each layer
 
 (Pdb) pp p[0].shape
torch.Size([1, 3, 80, 80, 7])
(Pdb) pp p[1].shape
torch.Size([1, 3, 40, 40, 7])
(Pdb) pp p[2].shape
torch.Size([1, 3, 20, 20, 7])

 
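  Get the number of anchors and the number of labels. With a batch of 6 images, this run produced 66 label boxes (other dumps in this article show 23 or 63 targets and appear to come from different batches).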
 
 na, nt = self.na, targets.shape[0]  # number of anchors, targets
tcls, tbox, indices, anch = [], [], [], []

 (Pdb) pp na
3
(Pdb) pp nt
66

 
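  targets holds the labels. They are first copied three times, and each copy is paired with one of the three anchor sizes, so every label box effectively has three different anchors.


  A value holding the anchor index is appended to the last dimension of targets; all later filtering works at the granularity of a single anchor. Each row of targets now contains:

   image, class, x, y, w, h, anchor_id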
  
 
 targets = torch.cat((targets.repeat(na, 1, 1), ai[..., None]), 2)
>>>
(Pdb) pp targets.shape
torch.Size([3, 63, 7])
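
  The effect of `targets.repeat(na, 1, 1)` followed by concatenating `ai` can be mimicked with nested lists. This is only a plain-Python sketch of the data layout, not the tensor code itself; `attach_anchor_ids` is a hypothetical helper name:

```python
def attach_anchor_ids(targets, na=3):
    """Copy the label rows once per anchor and append the anchor index to each row.

    targets: list of [image, class, x, y, w, h] rows.
    Returns a nested list shaped (na, nt, 7).
    """
    return [[row + [float(a)] for row in targets] for a in range(na)]


labels = [
    [0.0, 1.0, 0.22977, 0.56171, 0.08636, 0.09367],
    [0.0, 0.0, 0.06260, 0.59307, 0.07843, 0.08812],
]
expanded = attach_anchor_ids(labels)
print(len(expanded), len(expanded[0]), len(expanded[0][0]))  # 3 2 7
print(expanded[2][0][-1])  # 2.0, the anchor index of the third copy
```

  With 63 labels this layout corresponds to the `[3, 63, 7]` shape shown in the dump above.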

 
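  Define the bias g = 0.5 (the threshold on the fractional part of the center that decides which neighboring cells are added) and off, the candidate offsets for the extension cells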
 
 g = 0.5  # bias
off = torch.tensor(
    [
        [0, 0],
        [1, 0],
        [0, 1],
        [-1, 0],
        [0, -1],  # j,k,l,m
        # [1, 1], [1, -1], [-1, 1], [-1, -1],  # jk,jm,lk,lm
    ],
    device=self.device).float() * g  # offsets

 
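  Getting the positive anchors


  Loop over the three scales and, at each scale, get the positive anchors and the extension cells.

  First the label boxes are scaled to the current feature map. The size is read from the predictions passed in, e.g. 80 * 80, so the centers, widths and heights are scaled to the 80*80 grid. Before scaling the values are normalized to 0-1; afterwards they range over 0-80.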
 
 anchors, shape = self.anchors[i], p[i].shape
"""
(Pdb) anchors
tensor([[1.25000, 1.62500],
        [2.00000, 3.75000],
        [4.12500, 2.87500]], device='cuda:0')
(Pdb) shape
torch.Size([1, 3, 80, 80, 7])
"""
gain[2:6] = torch.tensor(shape)[[3, 2, 3, 2]]  # xyxy gain
"""
(Pdb) gain
tensor([ 1.,  1., 80., 80., 80., 80.,  1.], device='cuda:0')
"""

# Match targets to anchors
t = targets * gain  # shape(3,n,7)  # scale the normalized xywh to the current feature map

 
  Each row of t is now: image_id, class_id, x, y, width and height at the current scale, and anchor_id (unchanged, because its gain is 1).
 
 (Pdb) pp t.shape
torch.Size([3, 63, 7])
(Pdb) pp t[0,0]
tensor([ 0.00000,  1.00000, 18.38171, 44.93684,  6.90862,  7.49398,  0.00000], device='cuda:0')
(Pdb) pp t
tensor([[[ 0.00000,  1.00000, 18.38171,  ...,  6.90862,  7.49398,  0.00000],
         [ 0.00000,  0.00000,  5.00814,  ...,  6.27480,  7.04943,  0.00000],
         [ 0.00000,  0.00000, 12.70904,  ...,  4.81701,  5.14418,  0.00000],
         ...,
         [ 5.00000,  0.00000, 10.32074,  ...,  5.19988,  5.59987,  0.00000],
         [ 5.00000,  0.00000, 31.28691,  ...,  5.79986,  6.66651,  0.00000],
         [ 5.00000,  0.00000, 51.81977,  ...,  5.66653,  5.93320,  0.00000]],

        [[ 0.00000,  1.00000, 18.38171,  ...,  6.90862,  7.49398,  1.00000],
         [ 0.00000,  0.00000,  5.00814,  ...,  6.27480,  7.04943,  1.00000],
         [ 0.00000,  0.00000, 12.70904,  ...,  4.81701,  5.14418,  1.00000],
         ...,
         [ 5.00000,  0.00000, 10.32074,  ...,  5.19988,  5.59987,  1.00000],
         [ 5.00000,  0.00000, 31.28691,  ...,  5.79986,  6.66651,  1.00000],
         [ 5.00000,  0.00000, 51.81977,  ...,  5.66653,  5.93320,  1.00000]],

        [[ 0.00000,  1.00000, 18.38171,  ...,  6.90862,  7.49398,  2.00000],
         [ 0.00000,  0.00000,  5.00814,  ...,  6.27480,  7.04943,  2.00000],
         [ 0.00000,  0.00000, 12.70904,  ...,  4.81701,  5.14418,  2.00000],
         ...,
         [ 5.00000,  0.00000, 10.32074,  ...,  5.19988,  5.59987,  2.00000],
         [ 5.00000,  0.00000, 31.28691,  ...,  5.79986,  6.66651,  2.00000],
         [ 5.00000,  0.00000, 51.81977,  ...,  5.66653,  5.93320,  2.00000]]], device='cuda:0')
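
The multiplication by `gain` can be checked by hand on the first label row. The sketch below, in plain Python with an assumed helper name `to_grid`, scales one 7-element row to an 80x80 feature map and reproduces the `t[0,0]` values from the dump up to rounding of the printed inputs:

```python
def to_grid(target, grid_w, grid_h):
    """Scale x, y, w, h of one [image, class, x, y, w, h, anchor_id] row to grid units."""
    img, cls, x, y, w, h, anchor_id = target
    # image id, class id and anchor id have gain 1 and stay unchanged
    return [img, cls, x * grid_w, y * grid_h, w * grid_w, h * grid_h, anchor_id]


row = [0.0, 1.0, 0.22977, 0.56171, 0.08636, 0.09367, 0.0]
scaled = to_grid(row, 80, 80)
print([round(v, 4) for v in scaled])
```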


 
  
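   YOLOv5 positive sample rule



  Positive and negative samples are decided by comparing the width and height of the label box with those of the anchor: if both ratios lie between 0.25 and 4, the anchor is a positive sample (the illustrating figure is not preserved).

  In that figure the gt is drawn in blue and dashed lines mark 0.25x and 4x its size. Any anchor between 0.25x and 4x matches.


  If there are label boxes, compute the width/height ratio between each label box and each anchor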
 
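 if nt:
    # width/height ratio of each target to each anchor
    r = t[..., 4:6] / anchors[:, None]

    # take the larger of the ratio and its reciprocal, then the larger over w and h, and compare with self.hyp['anchor_t'] = 4
    j = torch.max(r, 1 / r).max(2)[0] < self.hyp['anchor_t']  # compare

    # keep only the positive samples
    t = t[j]  # filter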
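
Taking the maximum of a ratio and its reciprocal and requiring it to be below 4 is the same as requiring the ratio to lie strictly between 0.25 and 4. A plain-Python sketch of the test for a single gt, where `anchor_matches` is an illustrative name and the anchors are the 80x80-layer values from the dumps:

```python
def anchor_matches(gt_wh, anchors, anchor_t=4.0):
    """Return one boolean per anchor: True if both w and h ratios are within 1/anchor_t..anchor_t."""
    results = []
    for aw, ah in anchors:
        rw, rh = gt_wh[0] / aw, gt_wh[1] / ah
        worst = max(rw, 1 / rw, rh, 1 / rh)  # largest deviation in either direction
        results.append(worst < anchor_t)
    return results


P3_ANCHORS = [(1.25, 1.625), (2.0, 3.75), (4.125, 2.875)]

# Width and height of t[0,0] from the dump above
print(anchor_matches((6.90862, 7.49398), P3_ANCHORS))  # [False, True, True]
```

The comparison is strict, so a ratio of exactly 4 (as in the earlier `gt_box = [5, 4]` example against the first anchor) does not match.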

 
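  t now holds every (label box, anchor) pair that passed the test and is used next to compute the grid cells. The output of this stage is all the qualifying anchors. Each row of t is

   image, class, x, y, w, h, anchor_id;

  one image can have several label boxes, and a label box can match several anchors.



   Cross-anchor matching



  Cross-anchor matching is built into the computation of r. As described under Preparation, the labels were copied three times with one anchor per copy, so each label box has three anchors of different sizes. Every copy that passes the ratio test counts as a positive sample, and the 3 anchors are tested independently, so one label box can match several anchors.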
 
 (Pdb) pp t.shape
torch.Size([3, 63, 7])
(Pdb) pp t
tensor([[[ 0.00000,  1.00000, 18.38171,  ...,  6.90862,  7.49398,  0.00000],
         [ 0.00000,  0.00000,  5.00814,  ...,  6.27480,  7.04943,  0.00000],
         [ 0.00000,  0.00000, 12.70904,  ...,  4.81701,  5.14418,  0.00000],
         ...,
         [ 5.00000,  0.00000, 10.32074,  ...,  5.19988,  5.59987,  0.00000],
         [ 5.00000,  0.00000, 31.28691,  ...,  5.79986,  6.66651,  0.00000],
         [ 5.00000,  0.00000, 51.81977,  ...,  5.66653,  5.93320,  0.00000]],

        [[ 0.00000,  1.00000, 18.38171,  ...,  6.90862,  7.49398,  1.00000],
         [ 0.00000,  0.00000,  5.00814,  ...,  6.27480,  7.04943,  1.00000],
         [ 0.00000,  0.00000, 12.70904,  ...,  4.81701,  5.14418,  1.00000],
         ...,
         [ 5.00000,  0.00000, 10.32074,  ...,  5.19988,  5.59987,  1.00000],
         [ 5.00000,  0.00000, 31.28691,  ...,  5.79986,  6.66651,  1.00000],
         [ 5.00000,  0.00000, 51.81977,  ...,  5.66653,  5.93320,  1.00000]],

        [[ 0.00000,  1.00000, 18.38171,  ...,  6.90862,  7.49398,  2.00000],
         [ 0.00000,  0.00000,  5.00814,  ...,  6.27480,  7.04943,  2.00000],
         [ 0.00000,  0.00000, 12.70904,  ...,  4.81701,  5.14418,  2.00000],
         ...,
         [ 5.00000,  0.00000, 10.32074,  ...,  5.19988,  5.59987,  2.00000],
         [ 5.00000,  0.00000, 31.28691,  ...,  5.79986,  6.66651,  2.00000],
         [ 5.00000,  0.00000, 51.81977,  ...,  5.66653,  5.93320,  2.00000]]], device='cuda:0')
(Pdb) pp t[0,0]
tensor([ 0.00000,  1.00000, 18.38171, 44.93684,  6.90862,  7.49398,  0.00000], device='cuda:0')

 
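  Getting the extension cells


  Besides the anchor in the cell containing the gt center, YOLOv5 also treats the corresponding anchor in neighboring cells as positive. Of the four neighbors (up, down, left, right), the ones chosen are those the center is closest to. In the example figure (not preserved) the center is closer to the top and right borders, so the corresponding anchors in the upper and right cells become positive samples.


  The extension cells are obtained in a few steps:

   Get the center coordinates gxy of all gts
   Get the distance from each center to the bottom/right edge of the feature map
   Work out which two of the four cell borders the center is closest to
   Get every cell that holds a positive anchor, i.e. the gt center cell and the extension cells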
  
 
 gxy = t[:, 2:4]  # grid xy
gxi = gain[[2, 3]] - gxy  # inverse
j, k = ((gxy % 1 < g) & (gxy > 1)).T
"""
(Pdb) ((gxy % 1 < g) & (gxy > 1)).shape
torch.Size([186, 2])
(Pdb) ((gxy % 1 < g) & (gxy > 1)).T.shape
torch.Size([2, 186])
"""
l, m = ((gxi % 1 < g) & (gxi > 1)).T

j = torch.stack((torch.ones_like(j), j, k, l, m))
"""
torch.ones_like(j): the grid cell containing the gt center
j, k, l, m: the left, up, right and down extension cells

torch.Size([5, 51])
"""
t = t.repeat((5, 1, 1))[j]
"""
The labels are repeated 5 times and indexed with the mask above, keeping one row per selected grid cell
(Pdb) pp t.shape
torch.Size([153, 7])
(Pdb) t.repeat((5, 1, 1)).shape
torch.Size([5, 153, 7])
(Pdb) pp t.shape
torch.Size([232, 7])
"""
offsets = (torch.zeros_like(gxy)[None] + off[:, None])[j]
"""
Offset for every selected grid cell; subtracting it from the label center gives the final grid cell
(Pdb) pp offsets.shape
torch.Size([529, 2])
"""

 
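  gxy is the center coordinate, measured from the top-left corner (0,0) of the 80*80 grid. gxi is 80 minus the center coordinate, i.e. the distance from the center to (80,80). Taking a coordinate modulo 1 reduces it to a position inside a single cell (the figure illustrating this is not preserved).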
 
 
  
 
 gxy = t[:, 2:4]  # grid xy
gxi = gain[[2, 3]] - gxy  # inverse
j, k = ((gxy % 1 < g) & (gxy > 1)).T

 
  The following mimics the operation above; j and k end up as boolean tensors
 
 >>> import torch
>>> 
>>> arr = torch.tensor([[1,2,3], [4,5,6]])
>>> one = arr % 2 < 2 
>>> two = arr > 3
>>> one
tensor([[True, True, True],
        [True, True, True]])
>>> two
tensor([[False, False, False],
        [ True,  True,  True]])
>>> one & two
tensor([[False, False, False],
        [ True,  True,  True]])

 
  How the distances are computed:
 
 j, k = ((gxy % 1 < g) & (gxy > 1)).T
"""
(Pdb) ((gxy % 1 < g) & (gxy > 1)).shape
torch.Size([186, 2])
(Pdb) ((gxy % 1 < g) & (gxy > 1)).T.shape
torch.Size([2, 186])
"""
l, m = ((gxi % 1 < g) & (gxi > 1)).T

 
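  gxy % 1 < g means x or y is less than 0.5 from the cell's left or top border, i.e. the center is closer to that border.

  gxy > 1 requires x or y to be greater than 1: x > 1 means cells in the first column cannot extend to the left, and y > 1 means cells in the first row cannot extend upward.


  Likewise, gxi gives the distances to the right and bottom edges, and the same test yields the boolean tensors l and m.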
 
 l, m = ((gxi % 1 < g) & (gxi > 1)).T

 
  Collecting all positive grid cells
 
 j = torch.stack((torch.ones_like(j), j, k, l, m))
t = t.repeat((5, 1, 1))[j]

 
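  j stores the match results for the center cell and the extension cells as a boolean array. torch.ones_like(j) marks the center cell, and j, k, l, m mark the left, up, right and down neighbors.

  t repeats the rows of the matched targets 5 times: the first copy is for the center cell and the other four are for the left, up, right and down neighbors.

  Indexing t with j keeps only the selected cells.


  Next, compute the offset from the center cell to each extension cell. With this offset, all cells (center and extension) can be obtained later. The computation relies on broadcasting.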
 
 offsets = (torch.zeros_like(gxy)[None] + off[:, None])[j]

 
  Example:
 
 >>> off
tensor([[ 0,  0],
        [ 1,  0],
        [ 0,  1],
        [-1,  0],
        [ 0, -1]])
>>> arr = torch.tensor([10])
>>> 
>>> 
>>> arr + off
tensor([[10, 10],
        [11, 10],
        [10, 11],
        [ 9, 10],
        [10,  9]])
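
 To see how the masks and the offsets work together on one point, here is a plain-Python sketch. It is an illustration under stated assumptions rather than the library code: `cells_for_center` is a hypothetical name, the feature map is passed in as `(width, height)`, and `int()` stands in for `.long()` truncation.

```python
OFF = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]  # center, left, up, right, down


def cells_for_center(gxy, grid, g=0.5):
    """Return the (x, y) grid cells selected for one gt center in grid units."""
    x, y = gxy
    ix, iy = grid[0] - x, grid[1] - y  # distances to the right/bottom edge (gxi)
    mask = [
        True,                     # center cell is always kept
        x % 1 < g and x > 1,      # j: closer to the left border
        y % 1 < g and y > 1,      # k: closer to the top border
        ix % 1 < g and ix > 1,    # l: closer to the right border
        iy % 1 < g and iy > 1,    # m: closer to the bottom border
    ]
    cells = []
    for keep, (ox, oy) in zip(mask, OFF):
        if keep:
            # mirrors gij = (gxy - offsets).long() with offsets = off * g
            cells.append((int(x - ox * g), int(y - oy * g)))
    return cells


print(cells_for_center((55.09, 36.23), (80, 80)))  # [(55, 36), (54, 36), (55, 35)]
print(cells_for_center((55.7, 36.8), (80, 80)))    # [(55, 36), (56, 36), (55, 37)]
```

 Subtracting half a cell and truncating moves the point into the neighbor only when the center sits in the matching half of its own cell, which is how an offset of 0.5 ends up selecting a whole adjacent cell.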

 
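  The following figures (not preserved here) visualize the positive anchors for one example:

   A mosaic-augmented image, with the label boxes in blue
   Positive grid cells at the three scales
   Positive anchors at the three scales
   Positive grid cells at the three scales, drawn on the original image
   Anchors at the three scales, drawn on the original image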
 
 
  
 
 
  
 
 
  
 
 
  Saving the results


  The following are extracted from t:

   bc: image_id, class_id
   gxy: gt center coordinates
   gwh: gt width and height
   a: anchor_id
  
 
 bc, gxy, gwh, a = t.chunk(4, 1)  # (image, class), grid xy, grid wh, anchors
a, (b, c) = a.long().view(-1), bc.long().T  # anchors, image, class
gij = (gxy - offsets).long()

 
  Get all positive grid cells:
 
 gij = (gxy - offsets).long()
gi, gj = gij.T  # grid indices

 
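  gxy is the gt center. Subtracting the corresponding offset and truncating to an integer gives the cell of every positive sample. Splitting x and y then gives gi and gj.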
 
 (Pdb) pp gij
tensor([[74, 24],
        [37, 28],
        [72,  9],
        [75, 11],
        [67,  5],
        [73,  5],
        [70,  5],
        [75,  1],
        ...)

 
  indices: stores the image, anchor and grid cell
 
# indices holds: image_id, anchor_id (0, 1, 2), grid y index, grid x index. These cells are the positive samples
indices.append((b, a, gj.clamp_(0, shape[2] - 1), gi.clamp_(0, shape[3] - 1)))  # image, anchor, grid

 (Pdb) pp a.shape
torch.Size([367])
(Pdb) pp gij.shape
torch.Size([367, 2])

 
  Saving the center offset
 
# tbox holds the offset of the gt center from the top-left corner of its grid cell; offsets relative to the extension cells are computed too
tbox.append(torch.cat((gxy - gij, gwh), 1))  # box

 
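  gij is the cell's origin (its top-left corner) and gxy is the gt center, so gxy - gij is the offset of the gt center from the cell's top-left corner.


  In the loss computation that follows, this offset is compared with the offset predicted by the network.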
 
 
  Saving the anchor width/height and the class id
 
 anch.append(anchors[a])  # anchors: anchor width and height
tcls.append(c)  # class id

 
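  That completes positive sample selection. Four lists are returned, each with one entry per prediction layer:

   indices: image_id, anchor_id, grid y index, grid x index
   tbox: offset of the gt center from the top-left corner of its grid cell, including offsets relative to the extension cells
   anch (anchors): anchor width and height
   tcls (class): class id


  The returned positive anchors are used in the loss computation. indices picks the corresponding cells out of the model output, and the center offsets in tbox are compared with the predicted center offsets relative to the cell's top-left corner; training keeps reducing the difference.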
 
 
  Q&A



   1. A positive sample is an anchor. Where does anchor matching show up in the process?



  targets holds the labels for this batch of images. Each row contains:

   image, class, x, y, w, h.
  
 
 (Pdb) pp targets.shape
torch.Size([63, 6])

tensor([[0.00000, 1.00000, 0.22977, 0.56171, 0.08636, 0.09367],
        [0.00000, 0.00000, 0.06260, 0.59307, 0.07843, 0.08812],
        [0.00000, 0.00000, 0.15886, 0.59386, 0.06021, 0.06430],
        [0.00000, 0.00000, 0.31930, 0.58910, 0.06576, 0.09129],
        [0.00000, 0.00000, 0.80959, 0.70458, 0.23025, 0.26275],
        [1.00000, 1.00000, 0.85008, 0.07597, 0.09781, 0.11827],
        [1.00000, 0.00000, 0.22484, 0.09267, 0.14065, 0.18534]


 targets = torch.cat((targets.repeat(na, 1, 1), ai[..., None]), 2)
>>>
(Pdb) pp targets.shape
torch.Size([3, 63, 7])

 
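  targets holds the labels. They are first copied three times because every cell at every scale has three anchors, which amounts to assigning one anchor to each copy of a label box.

  In the steps that follow, the label boxes are scaled to the current feature map and the anchors are filtered by width/height ratio, which yields all the positive anchors.

  The extension cells are then obtained from the center coordinates.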
 
 j = torch.stack((torch.ones_like(j), j, k, l, m))
t = t.repeat((5, 1, 1))[j]

 
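  At this point t is copied 5 times, and each row of each copy contains:

   image, class, x, y, w, h, anchor_id.

  The copies carry the anchor_id with them, so selecting two of the four neighboring cells also selects the same anchor in those two cells.

  Finally all the anchors are saved. The loss computation uses them in two ways:

   The anchor width and height serve as the reference: the model outputs width and height as a ratio of the anchor's
   The cell containing the anchor bounds the localization: the xy output by the network is an offset from the cell's top-left corner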
  
 
 
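  2. Where does cross-anchor matching show up?


  targets holds the labels. They are first copied three times because every cell at every scale has three anchors, which amounts to assigning one anchor to each copy of a label box.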
 
 r = t[..., 4:6] / anchors[:, None]  

# take the larger of the ratio and its reciprocal and compare with anchor_t (4)
j = torch.max(r, 1 / r).max(2)[0] < self.hyp['anchor_t']  # compare

# keep only the positive samples
t = t[j]  # filter

 
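  Cross-anchor matching is built into the computation of r. t holds three copies of the original labels, and each cell has three anchors, so each copy of a label corresponds to one anchor. Every copy that passes the ratio test counts as a positive sample, and the 3 anchors are tested independently.

  It is therefore possible for all three anchors to pass the ratio test against a gt, in which case all 3 copies of the label are kept and all three anchors become positive samples.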
 
 
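  3. Where does cross-grid matching show up?


  Cross-grid matching means that besides the cell containing the gt center, extension cells are selected as well.


  Selecting the extension cells is the cross-grid matching: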
 
 gxy = t[:, 2:4]  # grid xy
gxi = gain[[2, 3]] - gxy  # inverse
j, k = ((gxy % 1 < g) & (gxy > 1)).T
l, m = ((gxi % 1 < g) & (gxi > 1)).T
j = torch.stack((torch.ones_like(j), j, k, l, m))
t = t.repeat((5, 1, 1))[j]

 
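  4. Where does cross-scale matching show up?


  A label box can match anchors on different prediction branches. Anchor matching is done separately at each scale and the three scales do not affect one another, so a label box can match anchors on up to three scales.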
 
 for i in range(self.nl):
    anchors, shape = self.anchors[i], p[i].shape
    ...
    indices.append((b, a, gj.clamp_(0, shape[2] - 1), gi.clamp_(0, shape[3] - 1)))  # image, anchor, grid

    # tbox holds the offset of the gt center from the top-left corner of its grid cell; offsets relative to the extension cells are computed too
    tbox.append(torch.cat((gxy - gij, gwh), 1))  # box

 
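  In the anchor-matching visualizations for the three scales (figures not preserved here), the object in the upper-right corner is matched at every scale.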
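
  Because the ratio test compares a gt with an anchor at the same scale, it gives the same answer in pixels as in grid units, so the per-layer outcome can be sketched directly with the pixel anchors from the config. In this plain-Python sketch the 40x50 pixel gt box and the helper name `matches_per_layer` are made up for illustration:

```python
ANCHORS_PX = {
    "P3/8": [(10, 13), (16, 30), (33, 23)],
    "P4/16": [(30, 61), (62, 45), (59, 119)],
    "P5/32": [(116, 90), (156, 198), (373, 326)],
}


def matches_per_layer(gt_wh, anchors_by_layer, anchor_t=4.0):
    """Count, for each layer, how many anchors pass the width/height ratio test."""
    counts = {}
    for layer, anchors in anchors_by_layer.items():
        n = 0
        for aw, ah in anchors:
            rw, rh = gt_wh[0] / aw, gt_wh[1] / ah
            if max(rw, 1 / rw, rh, 1 / rh) < anchor_t:
                n += 1
        counts[layer] = n
    return counts


counts = matches_per_layer((40, 50), ANCHORS_PX)
print(counts)  # {'P3/8': 2, 'P4/16': 3, 'P5/32': 2}
```

  A medium-sized box like this one is matched on all three branches; very small or very large boxes would fail the test on the layers whose anchors are far from their size.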
  

  
 
 
  
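   5. Which anchor is used in the extension cells?



  Only positive samples that passed the ratio test are copied. In other words, an anchor must first match the gt in the center cell before it can be selected in an extension cell.

  The positive samples, each with an anchor of a definite size, are already fixed before the cells are extended. The extension cells are obtained by copying the positive samples 5 times, and the copy carries along the anchor_id matched in the center cell.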
 
 j = torch.stack((torch.ones_like(j), j, k, l, m))
t = t.repeat((5, 1, 1))[j]

 
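  Since it is the positive samples that are copied, the anchor used in an extension cell is the one already matched in the center cell.

  If two anchors in a cell are positive, then two anchors in each extension cell are positive as well. The anchor_id in an extension cell is the same as in the center cell.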
 
 
  
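   6. How is the gt offset computed for the extension cells?



  Computing the offset of the gt center from a cell's top-left corner involves two variables:

   gxy: the center coordinates
   gij: the cell's origin (top-left corner)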
  
 
 gij = (gxy - offsets).long()

 
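  gij is obtained by subtracting the offset from the center and truncating to an integer
 
# tbox holds the offset of the gt center from the top-left corner of its grid cell; offsets relative to the extension cells are computed too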
tbox.append(torch.cat((gxy - gij, gwh), 1))  # box

 
  gxy - gij is computed for the extension cells in exactly the same way, so the offset for an extension cell is the displacement from that cell's top-left corner to the gt center.
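
  A short plain-Python sketch of that subtraction for one gt center and its selected cells; `box_offsets` is an illustrative name and the cell lists are worked out by hand from the selection rule described earlier:

```python
def box_offsets(gxy, cells):
    """Offset of the gt center from the top-left corner of each selected cell (gxy - gij)."""
    return [(gxy[0] - cx, gxy[1] - cy) for cx, cy in cells]


# Center in the upper-left part of cell (55, 36): left and upper neighbors are added
left_up = box_offsets((55.09, 36.23), [(55, 36), (54, 36), (55, 35)])
print([(round(dx, 2), round(dy, 2)) for dx, dy in left_up])
# [(0.09, 0.23), (1.09, 0.23), (0.09, 1.23)]

# Center in the lower-right part: right and lower neighbors are added
right_down = box_offsets((55.7, 36.8), [(55, 36), (56, 36), (55, 37)])
print([(round(dx, 2), round(dy, 2)) for dx, dy in right_down])
# [(0.7, 0.8), (-0.3, 0.8), (0.7, -0.2)]
```

  For the center cell the offset lies in 0 to 1, while for extension cells it falls between -0.5 and 1.5, which is consistent with the negative values visible in the tbox dump earlier in the article.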



		