Attentive Contexts for Object Detection

Cited by: 181
Authors
Li, Jianan [1]
Wei, Yunchao [2]
Liang, Xiaodan [3]
Dong, Jian [4]
Xu, Tingfa [1]
Feng, Jiashi [4]
Yan, Shuicheng [4]
Affiliations
[1] Beijing Inst Technol, Sch Opt Engn, Beijing 100081, Peoples R China
[2] Beijing Jiaotong Univ, Beijing 100044, Peoples R China
[3] Sun Yat Sen Univ, Guangzhou 510006, Guangdong, Peoples R China
[4] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 119077, Singapore
Keywords
Context; neural networks; object detection
DOI
10.1109/TMM.2016.2642789
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Modern deep neural network-based object detection methods typically classify candidate proposals using their interior features. However, global and local surrounding contexts, which are believed to be valuable for object detection, are not yet fully exploited by existing methods. In this work, we take a step towards understanding how to robustly extract and use contextual information to facilitate object detection in practice. Specifically, we consider the following two questions: "how to identify useful global contextual information for detecting a certain object?" and "how to exploit local context surrounding a proposal for better inferring its contents?" We provide preliminary answers to these questions by developing a novel attention to context convolution neural network (AC-CNN)-based object detection model. AC-CNN incorporates global and local contextual information into the region-based CNN (e.g., Fast R-CNN and Faster R-CNN) detection framework and improves object detection performance. It consists of one attention-based global contextualized (AGC) subnetwork and one multi-scale local contextualized (MLC) subnetwork. To capture global context, the AGC subnetwork recurrently generates an attention map for an input image through multiple stacked long short-term memory layers, highlighting useful global contextual locations. To capture surrounding local context, the MLC subnetwork exploits both the inside and outside contextual information of each specific proposal at multiple scales. The global and local contexts are then fused to make the final detection decision. Extensive experiments on PASCAL VOC 2007 and VOC 2012 demonstrate the superiority of the proposed AC-CNN over well-established baselines.
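The two ideas named in the abstract, pooling global features under an attention map and looking at a proposal at several scales, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the scale factors (0.8, 1.2, 1.8), and the box format (x1, y1, x2, y2) are assumptions chosen for illustration, and the attention scores are taken as given inputs rather than produced by stacked LSTM layers.

```python
import math

def scale_box(box, factor, img_w, img_h):
    """Scale an (x1, y1, x2, y2) box about its center, clipped to the image."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * factor / 2.0
    half_h = (y2 - y1) * factor / 2.0
    return (max(0.0, cx - half_w), max(0.0, cy - half_h),
            min(float(img_w), cx + half_w), min(float(img_h), cy + half_h))

def multi_scale_boxes(box, img_w, img_h, factors=(0.8, 1.2, 1.8)):
    """Return one box per scale factor; factors < 1 look inside the
    proposal, factors > 1 include surrounding context (assumed values)."""
    return [scale_box(box, f, img_w, img_h) for f in factors]

def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, scores):
    """Weighted sum of per-location feature vectors.

    features: list of equal-length feature vectors, one per location.
    scores:   one unnormalized attention score per location.
    """
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features))
            for d in range(dim)]

if __name__ == "__main__":
    # Three views of one proposal in a 100x100 image.
    print(multi_scale_boxes((40, 40, 60, 60), 100, 100))
    # Global context vector from two locations with unequal attention.
    print(attention_pool([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0]))
```

In a full detector, features pooled from each scaled box and the attention-pooled global vector would be concatenated before the classification and box-regression layers; the sketch stops short of that fusion step.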
Pages: 944-954
Number of pages: 11