MatchFormer: Interleaving Attention in Transformers for Feature Matching

Cited by: 23
Authors
Wang, Qing [1]
Zhang, Jiaming [1]
Yang, Kailun [1]
Peng, Kunyu [1]
Stiefelhagen, Rainer [1]
Affiliations
[1] Karlsruhe Institute of Technology, Karlsruhe, Germany
Source
Keywords
Feature matching; Vision transformers
DOI
10.1007/978-3-031-26313-2_16
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Local feature matching is a computationally intensive task at the subpixel level. While detector-based methods coupled with feature descriptors struggle in low-texture scenes, CNN-based methods with a sequential extract-to-match pipeline fail to make use of the matching capacity of the encoder and tend to overburden the decoder for matching. In contrast, we propose a novel hierarchical extract-and-match transformer, termed MatchFormer. Inside each stage of the hierarchical encoder, we interleave self-attention for feature extraction and cross-attention for feature matching, yielding a human-intuitive extract-and-match scheme. Such a match-aware encoder relieves the overloaded decoder and makes the model highly efficient. Further, combining self- and cross-attention on multi-scale features in a hierarchical architecture improves matching robustness, particularly in low-texture indoor scenes or with limited outdoor training data. Thanks to this strategy, MatchFormer is a multi-win solution in efficiency, robustness, and precision. Compared to the previous best method in indoor pose estimation, our lite MatchFormer requires only 45% of the GFLOPs, yet achieves a +1.3% precision gain and a 41% running speed boost. The large MatchFormer reaches state-of-the-art on four different benchmarks, including indoor pose estimation (ScanNet), outdoor pose estimation (MegaDepth), homography estimation and image matching (HPatch), and visual localization (InLoc).
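The interleaving idea in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: it assumes plain scaled dot-product attention with residual updates and leaves out learned projections, multiple heads, positional encoding, and the multi-scale downsampling of a hierarchical encoder. The names `attention`, `interleaved_stage`, and the `layout` string ("s" for self-attention, "c" for cross-attention) are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention on lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    output = []
    for q in queries:
        weights = softmax(
            [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        )
        output.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return output


def add_residual(tokens, update):
    """Element-wise sum of two token lists with the same shape."""
    return [[t + u for t, u in zip(t_row, u_row)]
            for t_row, u_row in zip(tokens, update)]


def interleaved_stage(feat_a, feat_b, layout="sc"):
    """Apply self ("s") and cross ("c") attention in the given order.

    Assumption: one stage is just a sequence of attention steps, each
    followed by a residual update, applied to both images in parallel.
    """
    for kind in layout:
        if kind == "s":
            # Self-attention: each image attends to its own tokens.
            new_a = add_residual(feat_a, attention(feat_a, feat_a, feat_a))
            new_b = add_residual(feat_b, attention(feat_b, feat_b, feat_b))
        elif kind == "c":
            # Cross-attention: each image queries the other image's tokens.
            new_a = add_residual(feat_a, attention(feat_a, feat_b, feat_b))
            new_b = add_residual(feat_b, attention(feat_b, feat_a, feat_a))
        else:
            raise ValueError(f"unknown attention kind: {kind!r}")
        feat_a, feat_b = new_a, new_b
    return feat_a, feat_b


# Toy token sets: 3 tokens for image A, 2 tokens for image B, 2 channels.
image_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_b = [[0.5, 0.5], [1.0, 0.0]]
out_a, out_b = interleaved_stage(image_a, image_b, layout="sc")
```

With a layout that contains only "s", the features of one image never depend on the other image. Adding "c" steps inside the stage is what lets the encoder itself exchange information between the two images, instead of deferring all matching to a later decoder.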
Pages: 256-273
Page count: 18