CAM R-CNN: End-to-End Object Detection with Class Activation Maps

Cited by: 4
Authors
Zhang, Shengchuan [1]
Yu, Songlin [1]
Ding, Haixin [1]
Hu, Jie [1]
Cao, Liujuan [1]
Affiliations
[1] Xiamen Univ, Sch Informat, Dept Artificial Intelligence, Xiamen 361005, Fujian, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Object detection; Class activation maps; Attention mechanism; Transformer
DOI
10.1007/s11063-023-11335-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Class activation maps (CAMs), which generate attention maps for specific categories in an image, have been widely used in weakly-supervised object localization. CAMs can be obtained from category annotations alone, and these annotations are already part of the annotation information in fully-supervised object detection. How to use the attention information in CAMs to improve fully-supervised object detection is therefore an interesting problem. In this paper, we propose CAM R-CNN, which integrates the category-aware attention maps provided by CAMs into the object detection process. CAM R-CNN follows the common end-to-end pipeline of recent query-based object detectors, with two key CAM modules embedded in it. Specifically, the E-CAM module provides embedding-level attention by fusing proposal features with the attention information in CAMs through a transformer encoder, and the S-CAM module supplies spatial-level attention by multiplying feature maps with the top-activated attention map provided by CAMs. In our experiments on the challenging COCO dataset, CAM R-CNN outperforms several strong baselines. Furthermore, we show that the S-CAM module can be applied to two-stage detectors such as Faster R-CNN and Cascade R-CNN with consistent gains.
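The spatial-level attention described for the S-CAM module can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the tensor shapes, the use of the mean CAM response as the class score, and the min-max normalization of the attention map are all assumptions made for illustration. The CAM itself follows the standard definition, a class-weighted sum of feature channels.

```python
import numpy as np

def class_activation_maps(features, class_weights):
    """Standard CAM: weight each feature channel by the classifier weight.

    features:      (C, H, W) feature maps
    class_weights: (K, C) linear classifier weights, one row per class
    returns:       (K, H, W) one activation map per class
    """
    return np.einsum("kc,chw->khw", class_weights, features)

def s_cam_attention(features, class_weights, eps=1e-8):
    """Hypothetical S-CAM-style step: reweight features with the top CAM."""
    cams = class_activation_maps(features, class_weights)
    # Assumption: the class score is the spatial mean of its CAM, which equals
    # global average pooling followed by a bias-free linear classifier.
    scores = cams.mean(axis=(1, 2))
    top = int(np.argmax(scores))
    top_map = cams[top]
    # Assumption: min-max normalize the top-activated map to [0, 1].
    attn = (top_map - top_map.min()) / (top_map.max() - top_map.min() + eps)
    # Spatial attention: broadcast the (H, W) map over all C channels.
    return features * attn[None, :, :], attn, top

rng = np.random.default_rng(0)
features = rng.random((8, 4, 4))              # toy feature maps, C=8, H=W=4
class_weights = rng.standard_normal((3, 8))   # toy classifier, K=3 classes
attended, attn, top = s_cam_attention(features, class_weights)
```

In this sketch the attended features keep the shape of the input, so the step could in principle sit between a backbone and a detection head without changing downstream shapes; how the paper actually normalizes or selects the map is not stated in the abstract.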
Pages: 10483 - 10499
Number of pages: 17
Related papers
50 in total
  • [1] CAM R-CNN: End-to-End Object Detection with Class Activation Maps
    Shengchuan Zhang
    Songlin Yu
    Haixin Ding
    Jie Hu
    Liujuan Cao
    [J]. Neural Processing Letters, 2023, 55 : 10483 - 10499
  • [2] Sparse R-CNN: An End-to-End Framework for Object Detection
    Sun, Peize
    Zhang, Rufeng
    Jiang, Yi
    Kong, Tao
    Xu, Chenfeng
    Zhan, Wei
    Tomizuka, Masayoshi
    Yuan, Zehuan
    Luo, Ping
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 15650 - 15664
  • [3] Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
    Sun, Peize
    Zhang, Rufeng
    Jiang, Yi
    Kong, Tao
    Xu, Chenfeng
    Zhan, Wei
    Tomizuka, Masayoshi
    Li, Lei
    Yuan, Zehuan
    Wang, Changhu
    Luo, Ping
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 14449 - 14458
  • [4] End-to-End Object Detection by Sparse R-CNN With Hybrid Matching in Complex Traffic Scenes
    Han, Xue-juan
    Qu, Zhong
    Wang, Shi-Yan
    Xia, Shu-Fang
    Wang, Sheng-Ye
    [J]. IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01) : 512 - 525
  • [5] Cyclone detection with end-to-end super resolution and faster R-CNN
    Moustafa, Marwa S.
    Metwalli, Mohamed R.
    Samshitha, Roy
    Mohamed, Sayed A.
    Shovan, Barma
    [J]. EARTH SCIENCE INFORMATICS, 2024, 17 (03) : 1837 - 1850
  • [6] PolyR-CNN: R-CNN for end-to-end polygonal building outline extraction
    Jiao, Weiqin
    Persello, Claudio
    Vosselman, George
    [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2024, 218 : 33 - 43
  • [7] Panoptic Segmentation with an End-to-End Cell R-CNN for Pathology Image Analysis
    Zhang, Donghao
    Song, Yang
    Liu, Dongnan
    Jia, Haozhe
    Liu, Siqi
    Xia, Yong
    Huang, Heng
    Cai, Weidong
    [J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2018, PT II, 2018, 11071 : 237 - 244
  • [8] End-to-End Object Detection with YOLOF
    Xi, Xing
    Huang, Yangyang
    Wu, Weiye
    Luo, Ronghua
    [J]. ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT VII, ICIC 2024, 2024, 14868 : 101 - 112
  • [9] Oriented R-CNN for Object Detection
    Xie, Xingxing
    Cheng, Gong
    Wang, Jiabao
    Yao, Xiwen
    Han, Junwei
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 3500 - 3509
  • [10] R-CNN for Small Object Detection
    Chen, Chenyi
    Liu, Ming-Yu
    Tuzel, Oncel
    Xiao, Jianxiong
    [J]. COMPUTER VISION - ACCV 2016, PT V, 2017, 10115 : 214 - 230