Swin-Fusion: Swin-Transformer with Feature Fusion for Human Action Recognition

Cited by: 10
Authors
Chen, Tiansheng [1]
Mo, Lingfei [1]
Affiliations
[1] Southeast Univ, Sch Instrument Sci & Engn, Nanjing 210096, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Action recognition; Swin-Transformer; Feature pyramid; Image classification; NETWORK;
DOI
10.1007/s11063-023-11367-1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Human action recognition based on still images is one of the most challenging computer vision tasks. Over the past decade, convolutional neural networks (CNNs) have developed rapidly and achieved good performance on still-image human action recognition. Because CNNs lack long-range perception, however, it is difficult for them to capture the global structure of human behavior and the overall relationship between the behavior and its environment. Recently, transformer-based models have attracted wide attention in computer vision, reaching state-of-the-art (SOTA) results on several vision tasks. We explore the transformer's capability in still-image human action recognition and add a simple but effective feature fusion module to the Swin-Transformer model. More specifically, we propose a new transformer-based model for behavioral feature extraction that uses a pre-trained Swin-Transformer as the backbone network. Swin-Transformer's distinctive hierarchical structure, combined with the feature fusion module, is used to extract and fuse multi-scale behavioral information. Extensive experiments were conducted on five still-image human action recognition datasets: Li's action dataset, the Stanford-40 dataset, the PPMI-24 dataset, the AUC-V1 dataset, and the AUC-V2 dataset. The results indicate that the proposed Swin-Fusion model achieves better behavior recognition than previous improved CNN-based models by sharing and reusing feature maps of different scales at multiple stages, without modifying the original backbone training method and while increasing training resources by only 1.6%. The code and models will be available at https://github.com/cts4444/Swin-Fusion.
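Note: the abstract does not spell out the fusion module, so the following is only a shape-level sketch of what "multi-scale" means for a hierarchical Swin backbone. The defaults (224-pixel input, patch size 4, embedding width 96) are the standard Swin-T configuration; the fusion rule (pool each stage to one vector and concatenate) and the function names `swin_stage_shapes` and `fused_feature_dim` are assumptions made for illustration, not the authors' design.

```python
# Shape-only sketch: no tensors, just (height, width, channels) bookkeeping.

def swin_stage_shapes(image_size=224, patch_size=4, embed_dim=96, num_stages=4):
    """Return the (H, W, C) feature-map shape produced by each Swin stage.

    Stage 1 sees image_size // patch_size tokens per side. Each later stage
    starts with patch merging, which halves the resolution and doubles the
    channel count.
    """
    side = image_size // patch_size
    channels = embed_dim
    shapes = []
    for _ in range(num_stages):
        shapes.append((side, side, channels))
        side //= 2
        channels *= 2
    return shapes


def fused_feature_dim(shapes):
    """Length of the fused descriptor under a HYPOTHETICAL fusion rule:
    global-average-pool every stage to a C-vector, then concatenate.
    The paper's actual module may differ."""
    return sum(c for _, _, c in shapes)


shapes = swin_stage_shapes()
for i, (h, w, c) in enumerate(shapes, start=1):
    print(f"stage {i}: {h}x{w} tokens, {c} channels")
print("fused descriptor length:", fused_feature_dim(shapes))
```

Under these assumptions the four stages run from a fine 56x56 map to a coarse 7x7 map, which is the range of scales a fusion module on this backbone would have available to combine.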
Pages: 11109-11130
Number of pages: 22
Source
Neural Processing Letters, 2023, 55: 11109-11130