HA-Transformer: Harmonious aggregation from local to global for object detection

Cited by: 4
|
Authors
Chen, Yang [1 ]
Chen, Sihan [1 ]
Deng, Yongqiang [2 ]
Wang, Kunfeng [1 ]
Affiliations
[1] Beijing Univ Chem Technol, Coll Informat Sci & Technol, Beijing 100029, Peoples R China
[2] VanJee Technol, Beijing 100193, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Object detection; Transformer; multi-head self-attention; global interaction; transition module;
DOI
10.1016/j.eswa.2023.120539
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, the Vision Transformer (ViT), with its global modeling capability, has shown excellent performance in classification tasks, opening a new development direction for a range of vision tasks. However, because of the enormous cost of multi-head self-attention, reducing computational cost while retaining the capability for global interaction remains a major challenge. In this paper, we propose a new architecture that establishes an end-to-end connection from local to global via bridge tokens, so that global interaction is completed at the window level, effectively alleviating the quadratic complexity of the transformer. In addition, we consider a hierarchy of information from short-distance to long-distance, adding a transition module from local to global to aggregate information more harmoniously. We name the proposed method HA-Transformer. Experimental results on the COCO dataset show the excellent performance of HA-Transformer for object detection, outperforming several state-of-the-art methods.
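The complexity argument in the abstract can be illustrated with a back-of-the-envelope count of token-pair interactions. The sketch below is an assumption-laden simplification, not the paper's actual attention scheme: it assumes a fixed window size and one bridge token per window that handles all cross-window interaction, and merely contrasts the quadratic cost of full global self-attention with the roughly linear cost of window-local attention plus window-level global interaction.

```python
def full_attention_cost(n_tokens):
    # Full global self-attention: every token attends to every token,
    # so the interaction count grows quadratically with n_tokens.
    return n_tokens * n_tokens

def windowed_bridge_cost(n_tokens, window=49):
    # Hypothetical simplification of the bridge-token idea: tokens attend
    # only within their window, and one bridge token per window carries
    # global interaction across windows.
    n_windows = n_tokens // window
    local = n_windows * window * window   # intra-window attention
    bridge = n_windows * n_windows        # bridge tokens interact globally
    return local + bridge

# Example: a feature map of 12,544 tokens split into 7x7 (=49-token) windows.
n = 49 * 256
print(full_attention_cost(n))      # quadratic in n
print(windowed_bridge_cost(n))     # roughly linear in n for fixed window size
```

For a fixed window size, the dominant term is linear in the token count, while the residual quadratic term acts only on the (much smaller) number of windows; this is the sense in which window-level global interaction sidesteps the quadratic complexity of full self-attention.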
Pages: 9