A Transformer-based Late-Fusion Mechanism for Fine-Grained Object Recognition in Videos

Cited by: 3
Authors
Koch, Jannik [1]
Wolf, Stefan [1, 2]
Beyerer, Juergen [1, 2, 3]
Affiliations
[1] Fraunhofer IOSB, Karlsruhe, Germany
[2] Karlsruhe Inst Technol, Vis & Future Lab, Karlsruhe, Germany
[3] Fraunhofer Ctr Machine Learning, Munich, Germany
DOI
10.1109/WACVW58289.2023.00015
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fine-grained image classification is limited by considering only a single view, while in many cases, such as surveillance, a whole video exists that provides multiple perspectives. However, the potential of videos is mostly considered in the context of action recognition, while fine-grained object recognition is rarely considered as an application for video classification. As a result, recent video classification architectures are poorly suited to the task of fine-grained object recognition. We propose a novel, Transformer-based late-fusion mechanism for fine-grained video classification. Our approach achieves superior results to both early-fusion mechanisms, like the Video Swin Transformer, and a simple consensus-based late-fusion baseline with a modern Swin Transformer backbone. Additionally, we achieve improved efficiency, as our results show a large increase in accuracy with only a slight increase in computational complexity. Code is available at: https://github.com/wolfstefan/tlf.
Pages: 100-109
Page count: 10
Related papers
50 in total
  • [41] TECMH: Transformer-Based Cross-Modal Hashing For Fine-Grained Image-Text Retrieval
    Li, Qiqi
    Ma, Longfei
    Jiang, Zheng
    Li, Mingyong
    Jin, Bo
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 75 (02): 3713-3728
  • [42] Loop and distillation: Attention weights fusion transformer for fine-grained representation
    Fayou, Sun
    Ngo, Hea Choon
    Meng, Zuqiang
    Sek, Yong Wee
    IET COMPUTER VISION, 2023, 17 (04): 473-482
  • [43] Fine-Grained Classification of Wild Mushrooms Based on Feature Fusion and Attention Mechanism
    Qian Jiaxin
    Yu Pengfei
    Li Haiyan
    Li Hongsong
    LASER & OPTOELECTRONICS PROGRESS, 2023, 60 (04)
  • [44] Siamese transformer with hierarchical concept embedding for fine-grained image recognition
    Yilin LYU
    Liping JING
    Jiaqi WANG
    Mingzhe GUO
    Xinyue WANG
    Jian YU
    Science China (Information Sciences), 2023, 66 (03): 188-203
  • [45] Siamese transformer with hierarchical concept embedding for fine-grained image recognition
    Lyu, Yilin
    Jing, Liping
    Wang, Jiaqi
    Guo, Mingzhe
    Wang, Xinyue
    Yu, Jian
    SCIENCE CHINA-INFORMATION SCIENCES, 2023, 66 (03)
  • [46] Transformer with peak suppression and knowledge guidance for fine-grained image recognition
    Liu, Xinda
    Wang, Lili
    Han, Xiaoguang
    NEUROCOMPUTING, 2022, 492: 137-149
  • [47] Siamese transformer with hierarchical concept embedding for fine-grained image recognition
    Yilin Lyu
    Liping Jing
    Jiaqi Wang
    Mingzhe Guo
    Xinyue Wang
    Jian Yu
    Science China Information Sciences, 2023, 66
  • [48] Fine-Grained Radio Frequency Fingerprint Recognition Network Based on Attention Mechanism
    Zhang, Yulan
    Hu, Jun
    Jiang, Rundong
    Lin, Zengrong
    Chen, Zengping
    ENTROPY, 2024, 26 (01)
  • [49] Fine-grained Recognition of Chinese Food Image Based on DenseNet with Attention Mechanism
    Hao, Ran
    Gao, Weidong
    Mi, Jihang
    Zhao, Zhenwei
    TWELFTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2020), 2021, 11720
  • [50] Noun-based attention mechanism for Fine-grained Named Entity Recognition
    Rodriguez, Alejandro Jesus Castaneira
    Castro, Daniel Castro
    Herold Garcia, Silena
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 193