
A Survey of Domestic and International Research on Object Detection Algorithms


Source: CSDN, https://blog.csdn.net/Joejwu/article/details/131521981

Object detection is an important research direction in computer vision, and recent advances in deep learning have driven remarkable progress in the field. This article surveys the current state of object detection research, from traditional methods to deep-learning-based anchor-based and anchor-free algorithms, covering the principles, characteristics, and performance of each family of approaches.

Traditional Object Detection Algorithms

Traditional object detection algorithms rely on hand-crafted features, and their basic pipeline consists of candidate region selection, feature extraction, and classification. The Viola-Jones detector improves detection efficiency through the integral image, AdaBoost classifiers, and a cascade structure; the HOG detector extracts histograms of oriented gradients over local pixel blocks, giving it good robustness to illumination changes and deformation; the DPM detector builds on HOG with techniques such as bounding-box regression and won the VOC object detection challenge. Although these traditional algorithms achieved solid results in their time, they still lag far behind deep-learning-based algorithms in accuracy, computational cost, and detection speed.
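To make the integral-image idea behind Viola-Jones concrete, here is a minimal NumPy sketch (the function names are illustrative, not from any particular library): once the summed-area table is built, the sum of any rectangle, and hence any Haar-like feature, can be evaluated with at most four array lookups regardless of the rectangle's size.

```python
import numpy as np

def integral_image(img: np.ndarray) -> np.ndarray:
    """Summed-area table: ii[y, x] = sum of img[:y+1, :x+1]."""
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(ii: np.ndarray, top: int, left: int, bottom: int, right: int) -> float:
    """Sum of pixels in the inclusive rectangle [top:bottom, left:right],
    computed in O(1) from the integral image."""
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total
```

A Haar-like feature is then just a difference of two or more such rectangle sums, which is what makes the cascade fast enough to scan every window of the image.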

Deep-Learning-Based Anchor-Based Two-Stage Object Detection Algorithms

Deep-learning-based object detection algorithms fall into two broad categories, anchor-based and anchor-free, and anchor-based approaches can be further divided into single-stage and two-stage detectors. Two-stage detectors first select candidate regions from the input image and then detect objects and regress bounding boxes within those regions. The earliest CNN-based two-stage detector was R-CNN, which uses selective search to propose regions likely to contain objects, feeds each proposal to a CNN for feature extraction, and finally classifies the features with an SVM. SPPNet introduces a spatial pyramid pooling layer that avoids redundant per-proposal computation; Fast R-CNN reaches 70.0% mAP on the VOC 2007 dataset; Faster R-CNN introduces the Region Proposal Network (RPN) to generate proposals, greatly improving detection speed; FPN adds a top-down pathway with lateral connections, further improving accuracy; Cascade R-CNN stacks multiple cascaded detection heads trained with increasing IoU thresholds; and Grid R-CNN replaces box regression with keypoint-style grid localization, achieving state-of-the-art results.
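As an illustration of the two-stage pipeline (RPN proposals followed by per-region classification and box regression), the sketch below runs inference with torchvision's off-the-shelf Faster R-CNN. It assumes a recent torchvision (>= 0.13) for the weights argument, and the 0.5 score threshold is an arbitrary choice for the example.

```python
import torch
import torchvision

# Load a pre-trained Faster R-CNN with a ResNet-50 + FPN backbone.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# A dummy 3-channel image with values in [0, 1]; replace with a real image tensor.
image = torch.rand(3, 480, 640)

with torch.no_grad():
    # The model takes a list of images and returns one dict per image
    # with 'boxes' (x1, y1, x2, y2), 'labels', and 'scores'.
    predictions = model([image])[0]

keep = predictions["scores"] > 0.5  # discard low-confidence detections
print(predictions["boxes"][keep], predictions["labels"][keep])
```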


Figure 1.1 Roadmap of the development of object detection algorithms over the past 20 years

Deep-Learning-Based Anchor-Based Single-Stage Object Detection Algorithms

Single-stage detectors produce detection results directly, predicting class probabilities and bounding-box coordinates in one pass. The earliest single-stage detector, YOLO v1, introduced the idea of dividing the image into a grid and predicting bounding boxes and class probabilities for each cell. YOLO v2 and YOLO v3 improved the backbone network and added multi-scale detection branches, raising both accuracy and speed. SSD employs multi-reference and multi-resolution techniques, while RetinaNet introduces Focal Loss to address class imbalance. YOLO v4 integrates many detection tricks from the same period, and YOLO v5 surpasses YOLO v4 in both accuracy and speed, becoming one of the dominant single-stage detectors. The more recent YOLO v6 draws on ideas from RepVGG to redesign the backbone and neck, achieving 70.0% mAP.
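RetinaNet's Focal Loss down-weights well-classified examples so that training concentrates on hard ones, which is what tames the extreme foreground/background imbalance of dense single-stage detectors. A minimal binary-classification sketch follows; alpha = 0.25 and gamma = 2.0 are the defaults reported in the Focal Loss paper.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               alpha: float = 0.25, gamma: float = 2.0) -> torch.Tensor:
    """Binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t).

    logits:  raw predictions, any shape
    targets: 0/1 labels of the same shape
    """
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)            # prob. of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()
```

When gamma = 0 and alpha = 0.5 this reduces (up to scale) to ordinary cross-entropy; increasing gamma shrinks the loss contribution of easy negatives toward zero.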

Deep-Learning-Based Anchor-Free Object Detection Algorithms

Anchor-based detectors are sensitive to the number, sizes, and aspect ratios of their anchors, which motivated the rise of anchor-free detectors. CornerNet recasts bounding-box prediction as paired keypoint prediction; CenterNet detects object center points directly; FSAF proposes a feature-selective module for training the feature pyramid network; FCOS adopts a per-pixel prediction strategy; SAPD introduces soft-weighted anchor points and soft pyramid-level selection; and YOLOX makes the YOLO series anchor-free and adopts a decoupled dual-branch detection head. These algorithms all achieve strong performance on the COCO dataset.
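FCOS's per-pixel strategy is easy to state in code: every feature-map location predicts four distances (left, top, right, bottom) to the edges of the box that contains it, with no anchor shapes involved. A minimal decoding sketch, where the stride value and tensor layout are illustrative assumptions rather than the paper's exact implementation:

```python
import torch

def decode_fcos_boxes(ltrb: torch.Tensor, stride: int = 8) -> torch.Tensor:
    """Decode FCOS-style (l, t, r, b) distance predictions into boxes.

    ltrb:    (H, W, 4) non-negative distances predicted at each location
    returns: (H, W, 4) boxes as (x1, y1, x2, y2) in image coordinates
    """
    h, w, _ = ltrb.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    # Map each feature-map location to the center of its cell in the image.
    cx = (xs.float() + 0.5) * stride
    cy = (ys.float() + 0.5) * stride
    l, t, r, b = ltrb.unbind(dim=-1)
    return torch.stack((cx - l, cy - t, cx + r, cy + b), dim=-1)
```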


Figure 1.3 Schematic of an anchor-free object detection network architecture

References


  1. Viola P, Jones M. Rapid object detection using a boosted cascade of simple features[C]//Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001). IEEE, 2001, 1: I-I.
  2. Dalal N, Triggs B. Histograms of oriented gradients for human detection[C]//2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05). IEEE, 2005, 1: 886-893.
  3. Felzenszwalb P F, Girshick R B, McAllester D, et al. Object detection with discriminatively trained part-based models[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(9): 1627-1645.
  4. Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014: 580-587.
  5. He K, Zhang X, Ren S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
  6. Girshick R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. 2015: 1440-1448.
  7. Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. Advances in Neural Information Processing Systems, 2015, 28.
  8. Lin T Y, Maire M, Belongie S, et al. Microsoft COCO: Common objects in context[C]//European Conference on Computer Vision. Springer, Cham, 2014: 740-755.
  9. Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 2117-2125.
  10. Cai Z, Vasconcelos N. Cascade R-CNN: Delving into high quality object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 6154-6162.
  11. Lu X, Li B, Yue Y, et al. Grid R-CNN[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 7363-7372.
  12. Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 779-788.
  13. Redmon J, Farhadi A. YOLO9000: Better, faster, stronger[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 7263-7271.
  14. Redmon J, Farhadi A. YOLOv3: An incremental improvement[J]. arXiv preprint arXiv:1804.02767, 2018.
  15. Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[C]//European Conference on Computer Vision. Springer, Cham, 2016: 21-37.
  16. Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 2980-2988.
  17. Glenn Jocher, Alex Stoken, Ayush Chaurasia, et al. Ultralytics/YOLOv5: v6.0 - YOLOv5n 'Nano' models, Roboflow integration, TensorFlow export, OpenCV DNN support[Z]. Zenodo, 2021-10-12.
  18. Li C, Li L, Jiang H, et al. YOLOv6: A single-stage object detection framework for industrial applications[J]. arXiv preprint arXiv:2209.02976, 2022.
  19. Ding X, Zhang X, Ma N, et al. RepVGG: Making VGG-style ConvNets great again[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 13733-13742.
  20. Law H, Deng J. CornerNet: Detecting objects as paired keypoints[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 734-750.
  21. Duan K, Bai S, Xie L, et al. CenterNet: Keypoint triplets for object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 6569-6578.
  22. Zhu C, He Y, Savvides M. Feature selective anchor-free module for single-shot object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 840-849.
  23. Tian Z, Shen C, Chen H, et al. FCOS: Fully convolutional one-stage object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 9627-9636.
  24. Zhu C, Chen F, Shen Z, et al. Soft anchor-point object detection[C]//European Conference on Computer Vision. Springer, Cham, 2020: 91-107.
  25. Ge Z, Liu S, Wang F, et al. YOLOX: Exceeding YOLO series in 2021[J]. arXiv preprint arXiv:2107.08430, 2021.

This article was originally published on CSDN by Joejwu.
