Rethinking Attentive Object Detection via Neural Attention Learning

Cited by: 4
Authors
Ge, Chongjian [1]
Song, Yibing [2]
Ma, Chao [3]
Qi, Yuankai [4]
Luo, Ping [1]
Affiliations
[1] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[2] Fudan Univ, Inst AI3, Shanghai 200433, Peoples R China
[3] Shanghai Jiao Tong Univ, AI Inst, Shanghai 200240, Peoples R China
[4] Univ Adelaide, Australian Inst Machine Learning, Adelaide, SA 5005, Australia
Keywords
Object detection; visual attention; neural attention learning
DOI
10.1109/TIP.2023.3251693
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Visual attention advances object detection by directing neural networks toward object representations. While existing methods incorporate empirical modules to strengthen network attention, we rethink attentive object detection from the network learning perspective in this work. We propose a NEural Attention Learning approach (NEAL), which consists of two parts. During the back-propagation of each training iteration, we first calculate the partial derivatives (a.k.a. the accumulated gradients) of the classification output with respect to the input features. We refine these partial derivatives to obtain attention response maps whose elements reflect their contributions to the final network predictions. Then, we formulate the attention response maps as extra objective functions, which are combined with the original detection loss to train detectors in an end-to-end manner. In this way, we learn an attentive CNN model without introducing additional network structures. We apply NEAL to two-stage object detection frameworks, which are usually composed of a CNN feature backbone, a region proposal network (RPN), and a classifier. We show that the proposed NEAL not only helps the RPN attend to objects but also enables the classifier to pay more attention to the premier positive samples. As a result, localization (proposal generation) and classification mutually benefit from each other in our proposed method. Extensive experiments on large-scale benchmark datasets, including MS COCO 2017 and Pascal VOC 2012, demonstrate that the proposed NEAL algorithm improves two-stage object detectors over state-of-the-art approaches.
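The two parts described in the abstract (gradients of the classification output with respect to the input features, then an attention-based term added to the detection loss) can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not state how the derivatives are refined or what form the extra objective takes, so the sketch assumes a gradient-times-activation map with ReLU and peak normalization, and a mean squared gap to a binary object mask. All names (`classify`, `feature_gradients`, `attention_map`, `attention_loss`, `total_loss`, `lam`) are hypothetical, the classifier is a pooled linear layer rather than a detector head, and gradients are taken by finite differences instead of back-propagation.

```python
def classify(feat, w, b):
    """Toy classifier (assumption): global average pool per channel, then a linear logit."""
    score = b
    for c, channel in enumerate(feat):
        n = sum(len(row) for row in channel)
        score += w[c] * sum(sum(row) for row in channel) / n
    return score

def feature_gradients(feat, w, b, eps=1e-4):
    """Central finite differences of the logit w.r.t. every feature element."""
    grads = [[[0.0] * len(row) for row in ch] for ch in feat]
    for c, ch in enumerate(feat):
        for i, row in enumerate(ch):
            for j, v in enumerate(row):
                row[j] = v + eps
                up = classify(feat, w, b)
                row[j] = v - eps
                down = classify(feat, w, b)
                row[j] = v  # restore the original value
                grads[c][i][j] = (up - down) / (2 * eps)
    return grads

def attention_map(feat, grads):
    """Assumed refinement: sum gradient * activation over channels, ReLU, scale by the peak."""
    height, width = len(feat[0]), len(feat[0][0])
    amap = [[max(0.0, sum(grads[c][i][j] * feat[c][i][j] for c in range(len(feat))))
             for j in range(width)] for i in range(height)]
    peak = max(max(row) for row in amap)
    if peak > 0:
        amap = [[v / peak for v in row] for row in amap]
    return amap

def attention_loss(amap, mask):
    """Assumed extra objective: mean squared gap between attention and an object mask."""
    cells = [(a - m) ** 2 for ra, rm in zip(amap, mask) for a, m in zip(ra, rm)]
    return sum(cells) / len(cells)

def total_loss(detection_loss, amap, mask, lam=0.5):
    """Detection loss plus the weighted attention term (lam is a hypothetical weight)."""
    return detection_loss + lam * attention_loss(amap, mask)

# Two 2x2 feature channels; channel 0 votes for the class, channel 1 against it.
feat = [[[1.0, 0.0], [0.0, 0.0]],
        [[0.0, 0.0], [0.0, 2.0]]]
w, b = [2.0, -1.0], 0.0

grads = feature_gradients(feat, w, b)
amap = attention_map(feat, grads)
aligned = total_loss(1.0, amap, [[1, 0], [0, 0]])     # object where attention is
misaligned = total_loss(1.0, amap, [[0, 0], [0, 1]])  # object elsewhere
```

In this toy case the attention map peaks at the top-left cell, so the combined loss is lower when the object mask covers that cell than when it covers the bottom-right cell. The sketch only evaluates the combined loss; training a detector against it would additionally require differentiating the attention term with respect to the network parameters, which is outside what this example shows.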
Pages: 1726-1739
Page count: 14