Cnnformer: Transformer-Based Semantic Information Enhancement Framework for Behavior Recognition

Cited by: 0
Authors
Liu, Jindong [1, 2]
Xiao, Zidong [1]
Bai, Yan [3]
Xie, Fei [3]
Wu, Wei [3]
Zhu, Wenjuan [1]
He, Hua [2]
Affiliations
[1] Northwest Univ, Sch Informat & Technol, Xian 710127, Peoples R China
[2] Northwest Univ, Sch Foreign Languages, Xian 710127, Peoples R China
[3] Xijing Univ, Sch Comp Sci, Xian Key Lab Human Machine Integrat & Control Tech, Xian 710123, Peoples R China
Source
IEEE ACCESS | 2023 / Vol. 11
Funding
National Natural Science Foundation of China
Keywords
Behavior recognition; transformer; convolutional neural networks; semantic information; dilated convolution
DOI
10.1109/ACCESS.2023.3342076
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Behavior recognition is a vital task in computer vision. However, semantic information extraction remains insufficient in existing behavior recognition models. In this paper, we propose an improved behavior recognition model, called Cnnformer, to alleviate this problem. Cnnformer is a transformer-based semantic information enhancement model for behavior recognition. In Cnnformer, a new attention mechanism is designed and introduced into the encoder module. This attention mechanism uses dilated convolution to capture static context information, uses it to trigger the mining of dynamic context information, and then fuses the dynamic and static context information. In addition, four convolution layers are added in front of the encoder module; their strong inductive bias helps extract shallow feature representations (such as color, geometry, and texture). Finally, Cnnformer combines the convolution module and the attention module in the encoder module to learn local and global features simultaneously, thereby enhancing the visual representation. Experimental results show that Cnnformer achieves higher performance in behavior recognition: its Top-1 accuracy on the Kinetics-400 dataset is 3.4% higher than that of the base model.
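The abstract names dilated convolution as the operation that captures static context. The sketch below is not the Cnnformer attention mechanism, whose details the abstract does not give; it only illustrates, in one dimension and plain Python, how dilation widens the span of input that each output sees without adding kernel weights. The function names, the toy signal, and the kernel are assumptions made for illustration.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """1-D dilated convolution without padding (cross-correlation form).

    Kernel taps are applied to inputs spaced `dilation` positions apart.
    """
    span = receptive_field(len(kernel), dilation)
    if len(signal) < span:
        return []
    return [
        sum(w * signal[start + k * dilation] for k, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


# Hypothetical toy input: a ramp, filtered with a 3-tap difference kernel.
signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

dense = dilated_conv1d(signal, kernel, dilation=1)    # each output spans 3 inputs
dilated = dilated_conv1d(signal, kernel, dilation=2)  # each output spans 5 inputs
```

With the same three weights, the dilated variant compares values four positions apart instead of two, which is the sense in which dilation gathers wider context at no extra parameter cost.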
Pages: 141299 - 141308
Page count: 10
Related papers
50 in total
  • [1] A Transformer-Based Framework for Scene Text Recognition
    Selvam, Prabu
    Koilraj, Joseph Abraham Sundar
    Tavera Romero, Carlos Andres
    Alharbi, Meshal
    Mehbodniya, Abolfazl
    Webber, Julian L.
    Sengan, Sudhakar
    IEEE ACCESS, 2022, 10 : 100895 - 100910
  • [2] A Transformer-Based Framework for Biomedical Information Retrieval Systems
    Hall, Karl
    Jayne, Chrisina
    Chang, Victor
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VI, 2023, 14259 : 317 - 331
  • [3] ERTNet: an interpretable transformer-based framework for EEG emotion recognition
    Liu, Ruixiang
    Chao, Yihu
    Ma, Xuerui
    Sha, Xianzheng
    Sun, Limin
    Li, Shuo
    Chang, Shijie
    FRONTIERS IN NEUROSCIENCE, 2024, 18
  • [4] A novel transformer-based semantic segmentation framework for structural condition assessment
    Wang, Ruhua
    Shao, Yanda
    Li, Qilin
    Li, Ling
    Li, Jun
    Hao, Hong
    STRUCTURAL HEALTH MONITORING-AN INTERNATIONAL JOURNAL, 2024, 23 (02): : 1170 - 1183
  • [5] Transformer-Based Cross-Modal Information Fusion Network for Semantic Segmentation
    Duan, Zaipeng
    Huang, Xiao
    Ma, Jie
    NEURAL PROCESSING LETTERS, 2023, 55 (05) : 6361 - 6375
  • [6] Transformer-Based Cross-Modal Information Fusion Network for Semantic Segmentation
    Zaipeng Duan
    Xiao Huang
    Jie Ma
    Neural Processing Letters, 2023, 55 : 6361 - 6375
  • [7] A transformer-based network for speech recognition
    Tang L.
    International Journal of Speech Technology, 2023, 26 (02) : 531 - 539
  • [8] A Transformer-Based Unsupervised Domain Adaptation Method for Skeleton Behavior Recognition
    Yan, Qiuyan
    Hu, Yan
    IEEE ACCESS, 2023, 11 : 51689 - 51700
  • [9] TransRSS: Transformer-based Radar Semantic Segmentation
    Zou, Hao
    Xie, Zhen
    Ou, Jiarong
    Gao, Yutao
    2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA, 2023, : 6965 - 6972
  • [10] BertSRC: transformer-based semantic relation classification
    Lee, Yeawon
    Son, Jinseok
    Song, Min
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2022, 22 (01)