HA-Transformer: Harmonious aggregation from local to global for object detection

Cited by: 4

Authors:

Chen, Yang [1]

Chen, Sihan [1]

Deng, Yongqiang [2]

Wang, Kunfeng [1]

Affiliations:

[1] Beijing Univ Chem Technol, Coll Informat Sci & Technol, Beijing 100029, Peoples R China

[2] VanJee Technol, Beijing 100193, Peoples R China

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2023 / Vol. 230

Funding:

National Natural Science Foundation of China;

Keywords:

Object detection; Transformer; multi-head self-attention; global interaction; transition module;

DOI:

10.1016/j.eswa.2023.120539

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recently, the Vision Transformer (ViT), with its global modeling capability, has shown excellent performance on classification tasks, opening a new direction for a range of vision tasks. However, because multi-head self-attention is so costly, reducing computational cost while retaining the capability for global interaction remains a major challenge. In this paper, we propose a new architecture that establishes an end-to-end connection from local to global via bridge tokens, so that global interaction is completed at the window level, effectively addressing the quadratic complexity problem of the transformer. In addition, we consider a hierarchy of information from short-distance to long-distance and add a transition module from local to global to achieve a more harmonious aggregation of information. Our proposed method is named HA-Transformer. Experimental results on the COCO dataset show excellent performance of HA-Transformer for object detection, outperforming several state-of-the-art methods.


Pages: 9

50 records in total

[41] Transformer-Based Global PointPillars 3D Object Detection Method
Zhang, Lin
Meng, Hua
Yan, Yunbing
Xu, Xiaowei
ELECTRONICS, 2023, 12 (14)
[42] PointDet++: an object detection framework based on human local features with transformer encoder
Yudi Tang
Bing Wang
Wangli He
Feng Qian
Neural Computing and Applications, 2023, 35 : 10097 - 10108
[43] Local to Global: A Sparse Transformer-Based Small Object Detector for Remote Sensing Images
Li, Zheng
Wang, Yongcheng
Feng, Hao
Chen, Chi
Xu, Dongdong
Zhao, Tianqi
Gao, Yunxiao
Zhao, Zhikang
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
[44] The contrasting impact of global and local object attributes on Kanizsa figure detection
Conci, Markus
Mueller, Hermann J.
Elliott, Mark A.
PERCEPTION & PSYCHOPHYSICS, 2007, 69 (08) : 1278 - 1294
[45] Unifying Global-Local Representations in Salient Object Detection With Transformers
Ren, Sucheng
Zhao, Nanxuan
Wen, Qiang
Han, Guoqiang
He, Shengfeng
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 8 (04) : 2870 - 2879
[46] Salient object detection by local and global manifold regularized SVM model
Zhang, Lihe
Zhang, Dandan
Sun, Jiayu
Wei, Guohua
Bo, Hongguang
NEUROCOMPUTING, 2019, 340 : 42 - 54
[47] Salient object detection using local, global and high contrast graphs
Nouri, Fatemeh
Kazemi, Kamran
Danyali, Habibollah
SIGNAL IMAGE AND VIDEO PROCESSING, 2018, 12 (04) : 659 - 667
[48] The contrasting impact of global and local object attributes on Kanizsa figure detection
Markus Conci
Hermann J. Müller
Mark A. Elliott
Perception & Psychophysics, 2007, 69 : 1278 - 1294
[49] Salient object detection based on global to local visual search guidance
Wu, Yangxi
Zhang, Dongbo
Yin, Feng
Zhang, Ying
SIGNAL PROCESSING-IMAGE COMMUNICATION, 2022, 102
[50] Local to global purification strategy to realize collaborative camouflaged object detection
Tong, Jinghui
Bi, Yaqiu
Zhang, Cong
Bi, Hongbo
Yuan, Ye
COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 241
