A Transformer-based Late-Fusion Mechanism for Fine-Grained Object Recognition in Videos

Cited by: 3

Authors:

Koch, Jannik ^{[1]}

Wolf, Stefan ^{[1,2]}

Beyerer, Juergen ^{[1,2,3]}

Affiliations:

[1] Fraunhofer IOSB, Karlsruhe, Germany

[2] Karlsruhe Inst Technol, Vis & Future Lab, Karlsruhe, Germany

[3] Fraunhofer Ctr Machine Learning, Munich, Germany

Source:

2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS (WACVW) | 2023

DOI:

10.1109/WACVW58289.2023.00015

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Fine-grained image classification is limited by only considering a single view while in many cases, like surveillance, a whole video exists which provides multiple perspectives. However, the potential of videos is mostly considered in the context of action recognition while fine-grained object recognition is rarely considered as an application for video classification. This leads to recent video classification architectures being inappropriate for the task of fine-grained object recognition. We propose a novel, Transformer-based late-fusion mechanism for fine-grained video classification. Our approach achieves superior results to both early-fusion mechanisms, like the Video Swin Transformer, and a simple consensus-based late-fusion baseline with a modern Swin Transformer backbone. Additionally, we achieve improved efficiency, as our results show a high increase in accuracy with only a slight increase in computational complexity. Code is available at: https://github.com/wolfstefan/tlf.

Pages: 100-109

Page count: 10
