Applying the Lightweight CARAFE Upsampling Operator to YOLOv5/YOLOv8

Author: @小白创作中心

Reference
1. CSDN: https://blog.csdn.net/qq_32575047/article/details/141158837

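This article introduces CARAFE, a lightweight upsampling operator, and shows how to integrate it into YOLOv5 and YOLOv8. Replacing the nearest-neighbor upsampling layers in the neck with CARAFE lets the model aggregate contextual information during upsampling, which can improve detection of small objects. The full code and the integration steps are given below.

1. Introduction to the CARAFE Operator

CARAFE (Content-Aware ReAssembly of FEatures) is a lightweight upsampling operator with the following properties:

  1. Large receptive field: unlike bilinear interpolation and similar methods, which only look at a small sub-pixel neighborhood, CARAFE aggregates contextual information over a large receptive field.
  2. Content-aware: instead of applying one fixed kernel to every sample, CARAFE generates adaptive kernels on the fly, so each location of each input is processed according to its content.
  3. Lightweight and fast: CARAFE adds little computational overhead and is easy to integrate into existing network architectures.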
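
The three properties above follow from how CARAFE computes each output pixel. As a compact restatement in the notation of the CARAFE paper (arXiv:1905.02188; the symbols $\psi$, $N(\cdot)$, $\sigma$ and $r$ are borrowed from there and do not appear elsewhere in this article), for an upsampling ratio $\sigma$ each target location $l' = (i', j')$ maps back to the source location $l = (i, j) = (\lfloor i'/\sigma \rfloor, \lfloor j'/\sigma \rfloor)$, and:

```latex
\begin{aligned}
W_{l'} &= \psi\bigl(N(X_l,\, k_{\text{enc}})\bigr), \\
X'_{l'} &= \sum_{n=-r}^{r} \sum_{m=-r}^{r} W_{l'}(n, m)\, X_{(i+n,\; j+m)},
\qquad r = \lfloor k_{\text{up}} / 2 \rfloor .
\end{aligned}
```

Here $N(X_l, k)$ is the $k \times k$ neighborhood of $X$ centered at $l$, and $\psi$ is the kernel prediction module (channel compression, content encoding, softmax normalization). The kernel $W_{l'}$ is predicted from the content around $l$, so it differs from location to location, and it is shared across all channels, which is what keeps the operator cheap.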

[Figure: overall structure of the CARAFE operator]

2. Applying CARAFE to YOLOv5/YOLOv8

2.1 Integrating CARAFE into YOLOv5

First, create a new file named yolov5-CARAFE.yaml in which CARAFE replaces the two upsampling layers in the neck. The full YAML file is:

# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Parameters
nc: 4  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
- [10,13, 16,30, 33,23]  # P3/8  small objects
- [30,61, 62,45, 59,119]  # P4/16 medium objects
- [116,90, 156,198, 373,326]  # P5/32  large objects
# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2  output_channel, kernel_size, stride, padding
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]
# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, CARAFE, [3, 5]],
   #[-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, CARAFE, [3, 5]],
   #[-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)
  
   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
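
Swapping layers in a YOLOv5 YAML is easy to get wrong, because the Concat and Detect rows refer to other layers by absolute index. The following is a minimal sanity check, not part of YOLOv5: the layer table is transcribed by hand from the YAML above, and the op labels ("down", "up", "keep", "cat", "detect") and the function name check_strides are made up for this sketch. It tracks the downsampling stride of each layer and confirms that every Concat joins feature maps of the same stride.

```python
# (from, op) per layer, transcribed by hand from yolov5-CARAFE.yaml.
# op: "down" = stride-2 Conv, "up" = CARAFE (x2 upsampling),
#     "keep" = stride unchanged, "cat" = Concat, "detect" = Detect head.
YOLOV5_CARAFE = [
    (-1, "down"), (-1, "down"), (-1, "keep"), (-1, "down"), (-1, "keep"),  # 0-4
    (-1, "down"), (-1, "keep"), (-1, "down"), (-1, "keep"), (-1, "keep"),  # 5-9
    (-1, "keep"), (-1, "up"), ([-1, 6], "cat"), (-1, "keep"),              # 10-13
    (-1, "keep"), (-1, "up"), ([-1, 4], "cat"), (-1, "keep"),              # 14-17
    (-1, "down"), ([-1, 14], "cat"), (-1, "keep"),                         # 18-20
    (-1, "down"), ([-1, 10], "cat"), (-1, "keep"),                         # 21-23
    ([17, 20, 23], "detect"),                                              # 24
]


def check_strides(layers):
    """Return (per-layer strides, Detect input strides); raise on a mismatched Concat."""
    strides = []
    detect_inputs = None
    for idx, (src, op) in enumerate(layers):
        sources = src if isinstance(src, list) else [src]
        ins = []
        for s in sources:
            j = s if s >= 0 else idx + s             # negative = relative index
            ins.append(strides[j] if j >= 0 else 1)  # j < 0: the input image
        if op == "down":
            out = ins[0] * 2
        elif op == "up":
            out = ins[0] // 2
        elif op == "cat":
            if len(set(ins)) != 1:
                raise ValueError(f"layer {idx}: Concat inputs have strides {ins}")
            out = ins[0]
        elif op == "detect":
            detect_inputs = ins
            out = ins[-1]
        else:  # "keep"
            out = ins[0]
        strides.append(out)
    return strides, detect_inputs


strides, detect = check_strides(YOLOV5_CARAFE)
print(detect)  # [8, 16, 32]
```

The Detect inputs come out at strides 8, 16 and 32, matching the P3/P4/P5 comments in the YAML. If one of the CARAFE rows were removed without updating the indices, the next Concat would raise.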

Next, add the CARAFE source code to the project. Create a new file CARAFE.py containing:

import torch
from torch import nn
from models.common import Conv

class CARAFE(nn.Module):
    def __init__(self, c, k_enc=3, k_up=5, c_mid=64, scale=2):
        """ The unofficial implementation of the CARAFE module.
        The details are in "https://arxiv.org/abs/1905.02188".
        Args:
            c: The channel number of the input and the output.
            c_mid: The channel number after compression.
            scale: The expected upsample scale.
            k_up: The size of the reassembly kernel.
            k_enc: The kernel size of the encoder.
        Returns:
            X: The upsampled feature map.
        """
        super(CARAFE, self).__init__()
        self.scale = scale
        self.comp = Conv(c, c_mid)
        self.enc = Conv(c_mid, (scale * k_up) ** 2, k=k_enc, act=False)
        self.pix_shf = nn.PixelShuffle(scale)
        self.upsmp = nn.Upsample(scale_factor=scale, mode='nearest')
        self.unfold = nn.Unfold(kernel_size=k_up, dilation=scale,
                                padding=k_up // 2 * scale)

    def forward(self, X):
        b, c, h, w = X.size()
        h_, w_ = h * self.scale, w * self.scale

        # Kernel prediction (shape comments assume the defaults k_up=5, scale=2)
        W = self.comp(X)  # b * c_mid * h * w
        W = self.enc(W)  # b * 100 * h * w
        W = self.pix_shf(W)  # b * 25 * h_ * w_
        W = torch.softmax(W, dim=1)  # b * 25 * h_ * w_

        # Content-aware reassembly
        X = self.upsmp(X)  # b * c * h_ * w_
        X = self.unfold(X)  # b * 25c * (h_ * w_)
        X = X.view(b, c, -1, h_, w_)  # b * c * 25 * h_ * w_
        X = torch.einsum('bkhw,bckhw->bchw', W, X)  # b * c * h_ * w_
        return X
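
The shape comments in forward are easier to trust when they can be recomputed. The sketch below is plain Python with no PyTorch dependency. It traces the tensor shapes through the listing above for arbitrary arguments, using the output-size formula given in the nn.Unfold documentation. The helper names unfold_out and carafe_shapes are made up for this example.

```python
def unfold_out(size, k, dilation, padding, stride=1):
    """Number of sliding-window positions along one axis (nn.Unfold formula)."""
    return (size + 2 * padding - dilation * (k - 1) - 1) // stride + 1


def carafe_shapes(b, c, h, w, k_enc=3, k_up=5, c_mid=64, scale=2):
    """Trace tensor shapes through the CARAFE forward pass listed above."""
    h_, w_ = h * scale, w * scale
    pad = k_up // 2 * scale  # same padding as self.unfold
    lh = unfold_out(h_, k_up, scale, pad)
    lw = unfold_out(w_, k_up, scale, pad)
    if (lh, lw) != (h_, w_):
        # With an even k_up the padding no longer preserves the spatial
        # size, and the later view(b, c, -1, h_, w_) would not fit.
        raise ValueError("k_up must be odd for this padding scheme")
    return {
        "comp": (b, c_mid, h, w),
        "enc": (b, (scale * k_up) ** 2, h, w),
        "pix_shf": (b, k_up ** 2, h_, w_),   # softmax keeps this shape
        "upsmp": (b, c, h_, w_),
        "unfold": (b, c * k_up ** 2, lh * lw),
        "view": (b, c, k_up ** 2, h_, w_),
        "out": (b, c, h_, w_),
    }


for stage, shape in carafe_shapes(1, 256, 20, 20).items():
    print(f"{stage:8s} {shape}")
```

For a 1 x 256 x 20 x 20 input with the default arguments, the encoder produces 100 channels, pixel shuffle turns them into 25 kernel weights per output pixel at 40 x 40, and the result is 1 x 256 x 40 x 40. The check also makes one implicit constraint visible: the padding k_up // 2 * scale only preserves the spatial size when k_up is odd.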

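Then register the CARAFE operator in yolo.py: import the CARAFE class at the top of the file (the import path depends on where CARAFE.py was saved), and add the following branch to the if/elif chain in parse_model: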

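elif m is CARAFE:
    c2 = ch[f]  # CARAFE keeps the channel count: output channels = input channels
    args = [c2, *args]  # prepend the channel count to the YAML args [k_enc, k_up]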
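
To see what this branch does to a YAML row such as [-1, 1, CARAFE, [3, 5]], the following stand-alone sketch mimics it outside of YOLOv5. The function name build_carafe_call and the example channel list are hypothetical. The parameter order is taken from the CARAFE constructor shown earlier.

```python
# Positional parameters of CARAFE.__init__, in order (excluding self).
CARAFE_PARAMS = ("c", "k_enc", "k_up", "c_mid", "scale")


def build_carafe_call(ch, f, yaml_args):
    """Mimic the CARAFE branch: return (c2, keyword view of the constructor call)."""
    c2 = ch[f]                  # channels of the layer CARAFE reads from
    args = [c2, *yaml_args]     # same list the branch builds
    return c2, dict(zip(CARAFE_PARAMS, args))


# Example: the previous layer outputs 256 channels (assumed value).
ch = [32, 64, 64, 128, 128, 256, 256, 512, 512, 512, 256]
c2, call = build_carafe_call(ch, -1, [3, 5])
print(c2, call)  # 256 {'c': 256, 'k_enc': 3, 'k_up': 5}
```

The two YAML arguments therefore set k_enc and k_up, while c_mid and scale keep their defaults of 64 and 2.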

Finally, edit the train.py launch script so that the model configuration file points to yolov5-CARAFE.yaml.

2.2 Integrating CARAFE into YOLOv8

For YOLOv8, likewise create a new file yolov8-CARAFE.yaml in which CARAFE replaces the two upsampling layers in the neck. The full YAML file is:

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect
# Parameters
nc: 10 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024] # YOLOv8n summary: 225 layers,  3157200 parameters,  3157184 gradients,   8.9 GFLOPs
  s: [0.33, 0.50, 1024] # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients,  28.8 GFLOPs
  m: [0.67, 0.75, 768] # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients,  79.3 GFLOPs
  l: [1.00, 1.00, 512] # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512] # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs
# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
- [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
- [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
- [-1, 3, C2f, [128, True]]
- [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
- [-1, 6, C2f, [256, True]]
- [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
- [-1, 6, C2f, [512, True]]
- [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
- [-1, 3, C2f, [1024, True]]
- [-1, 1, SPPF, [1024, 5]] # 9
# YOLOv8.0n head
head:
- [-1, 1, CARAFE, []]
- [[-1, 6], 1, Concat, [1]] # cat backbone P4
- [-1, 3, C2f, [512]] # 12
- [-1, 1, CARAFE, []]
- [[-1, 4], 1, Concat, [1]] # cat backbone P3
- [-1, 3, C2f, [256]] # 15 (P3/8-small)
- [-1, 1, Conv, [256, 3, 2]]
- [[-1, 12], 1, Concat, [1]] # cat head P4
- [-1, 3, C2f, [512]] # 18 (P4/16-medium)
- [-1, 1, Conv, [512, 3, 2]]
- [[-1, 9], 1, Concat, [1]] # cat head P5
- [-1, 3, C2f, [1024]] # 21 (P5/32-large)
- [[15, 18, 21], 1, Detect, [nc]] # Detect(P3, P4, P5)
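
Because the two CARAFE rows pass no arguments, the operator's width is decided entirely by the chosen scale. The sketch below estimates how many channels each CARAFE layer receives. It assumes the loader scales every nominal width as make_divisible(min(c, max_channels) * width, 8); treat that rule as an assumption to verify against the installed ultralytics version. The helper names are local to this example.

```python
import math

# [depth, width, max_channels] per scale, copied from the YAML above.
SCALES = {
    "n": (0.33, 0.25, 1024),
    "s": (0.33, 0.50, 1024),
    "m": (0.67, 0.75, 768),
    "l": (1.00, 1.00, 512),
    "x": (1.00, 1.25, 512),
}


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scaled_channels(nominal, scale):
    """Actual channel count for a nominal width under the assumed scaling rule."""
    _, width, max_channels = SCALES[scale]
    return make_divisible(min(nominal, max_channels) * width, 8)


def carafe_input_channels(scale):
    """Channels entering layer 10 (after SPPF, 1024) and layer 13 (after C2f, 512)."""
    return scaled_channels(1024, scale), scaled_channels(512, scale)


for s in SCALES:
    print(s, carafe_input_channels(s))
```

Under that assumption, scale n feeds 256 and 128 channels into the two CARAFE layers. Since c_mid stays at its default of 64 regardless of scale, the compression step is relatively mild for the smaller models and stronger for the larger ones.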

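Next, add the CARAFE source code to the YOLOv8 project. Copy the class into ultralytics/nn/modules/conv.py (the "from models.common import Conv" line is not needed there, because conv.py defines Conv itself) and register it globally in __init__.py:

  1. In conv.py, add CARAFE to __all__.
  2. In __init__.py, add CARAFE to the names imported from .conv, and add it to __all__ at the bottom of the file.

Finally, import CARAFE in tasks.py and add a branch for it in parse_model: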

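elif m is CARAFE:
    c2 = ch[f]  # CARAFE keeps the channel count; set c2 explicitly so later layers see it
    args = [c2, *args]  # the YAML passes no extra args, so this becomes [c2]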

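When launching train.py, set the YOLOv8 model file to yolov8n-CARAFE.yaml. As the comment in the YAML header notes, the n suffix makes the loader read yolov8-CARAFE.yaml with scale n.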
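
The naming convention can be illustrated with a few lines of Python. This is an approximation written for this article, modeled on the behavior described in the YAML header comment and not taken from the ultralytics source. The function name split_scale is hypothetical.

```python
import re
from pathlib import Path


def split_scale(model_file):
    """Return (base YAML name, scale letter or None) for e.g. 'yolov8n-CARAFE.yaml'."""
    name = Path(model_file).name
    m = re.match(r"^(yolov\d+)([nslmx])(.*)$", name)
    if m is None:
        return name, None
    return m.group(1) + m.group(3), m.group(2)


print(split_scale("yolov8n-CARAFE.yaml"))  # ('yolov8-CARAFE.yaml', 'n')
print(split_scale("yolov8-CARAFE.yaml"))   # ('yolov8-CARAFE.yaml', None)
```

The pattern in this sketch is case-sensitive, which is one reason to keep the file name in lower case and consistent with the yolov8-CARAFE.yaml file created earlier.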

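3. Summary

This article showed how to integrate the lightweight CARAFE upsampling operator into YOLOv5 and YOLOv8. Replacing the nearest-neighbor upsampling layers with CARAFE lets the model aggregate contextual information during upsampling, which can improve detection of small objects. With the code and integration steps above, readers can try it on their own data and tune it as needed.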
