Multi-Scale Attention Deep Neural Network for Fast Accurate Object Detection

Cited by: 52
Authors
Song, Kaiyou [1]
Yang, Hua [1]
Yin, Zhouping [1]
Affiliations
[1] Huazhong Univ Sci & Technol, State Key Lab Digital Mfg Equipment & Technol, Sch Mech Sci & Engn, Wuhan 430074, Hubei, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Object detection; attention model; feature fusion; deep neural network;
DOI
10.1109/TCSVT.2018.2875449
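Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code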
0808 ; 0809 ;
Abstract
Object detection remains a challenging task in computer vision due to the tremendous extent of changes in the appearances of objects caused by clustered backgrounds, occlusion, truncation, and scale change. Current deep neural network (DNN)-based object detection methods cannot simultaneously achieve a high accuracy and a high efficiency. To overcome this limitation, in this paper, we propose a novel multi-scale attention (MSA) DNN for accurate object detection with high efficiency. The proposed MSA-DNN method utilizes a novel multi-scale feature fusion module (MSFFM) to construct high-level semantic features. Subsequently, a novel MSA module (MSAM) based on the fused layers of the MSFFM is introduced to exploit the global semantic information of image-level labels to guide detection. On the one hand, MSAM can capture global semantic information to further enhance the semantic feature representation of the fused layers constructed by the MSFFM, thereby improving the detection accuracy. On the other hand, the MSA maps generated by MSAM can be employed to rapidly and coarsely locate objects at different scales. In addition, an attention-based hard negative mining strategy is introduced to filter out negative samples to reduce the search space, dramatically alleviating the severe class imbalance problem. Extensive experimental results on the challenging PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO datasets demonstrate that MSA-DNN achieves a state-of-the-art detection accuracy while maintaining a high efficiency. Furthermore, MSA-DNN significantly improves the small-object detection accuracy.
Pages: 2972 - 2985
Number of pages: 14
Related papers
50 in total
  • [1] A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection
    Cai, Zhaowei
    Fan, Quanfu
    Feris, Rogerio S.
    Vasconcelos, Nuno
    [J]. COMPUTER VISION - ECCV 2016, PT IV, 2016, 9908 : 354 - 370
  • [2] Multi-scale deep neural network for salient object detection
    Xiao, Fen
    Deng, Wenzheng
    Peng, Liangchan
    Cao, Chunhong
    Hu, Kai
    Gao, Xieping
    [J]. IET IMAGE PROCESSING, 2018, 12 (11) : 2036 - 2041
  • [3] Attention to the Scale: Deep Multi-Scale Salient Object Detection
    Zhang, Jing
    Dai, Yuchao
    Li, Bo
    He, Mingyi
    [J]. 2017 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING - TECHNIQUES AND APPLICATIONS (DICTA), 2017 : 105 - 111
  • [4] Multi-scale salient object detection network combining an attention mechanism
    Liu, Di
    Guo, Jichang
    Wang, Yudong
    Zhang, Yi
    [J]. Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2022, 49 (04) : 118 - 126
  • [5] Pyramid attention object detection network with multi-scale feature fusion
    Chen, Xiu
    Li, Yujie
    Nakatoh, Yoshihisa
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2022, 104
  • [6] MADNN: A Multi-scale Attention Deep Neural Network for Arrhythmia Classification
    Duan, Ran
    He, Xiaodong
    Ouyang, Zhuoran
    [J]. 2020 COMPUTING IN CARDIOLOGY, 2020
  • [7] MDFN: Multi-scale deep feature learning network for object detection
    Ma, Wenchi
    Wu, Yuanwei
    Cen, Feng
    Wang, Guanghui
    [J]. PATTERN RECOGNITION, 2020, 100
  • [8] Multi-scale coupled attention for visual object detection
    Li, Fei
    Yan, Hongping
    Shi, Linsu
    [J]. SCIENTIFIC REPORTS, 2024, 14 (01)
  • [9] Multi-Scale Feature Attention-DEtection TRansformer: Multi-Scale Feature Attention for security check object detection
    Sima, Haifeng
    Chen, Bailiang
    Tang, Chaosheng
    Zhang, Yudong
    Sun, Junding
    [J]. IET COMPUTER VISION, 2024, 18 (05) : 613 - 625
  • [10] Multi-level and multi-scale deep saliency network for salient object detection
    Zhang, Qing
    Lin, Jiajun
    Zhuge, Jingling
    Yuan, Wenhao
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2019, 59 : 415 - 424