Cnnformer: Transformer-Based Semantic Information Enhancement Framework for Behavior Recognition

Authors
Liu, Jindong [1, 2]
Xiao, Zidong [1]
Bai, Yan [3]
Xie, Fei [3]
Wu, Wei [3]
Zhu, Wenjuan [1]
He, Hua [2]
Affiliations
[1] Northwest Univ, Sch Informat & Technol, Xian 710127, Peoples R China
[2] Northwest Univ, Sch Foreign Languages, Xian 710127, Peoples R China
[3] Xijing Univ, Sch Comp Sci, Xian Key Lab Human Machine Integrat & Control Tech, Xian 710123, Peoples R China
Source
IEEE ACCESS | 2023 / Vol. 11
Funding
National Natural Science Foundation of China
Keywords
Behavior recognition; transformer; convolutional neural networks; semantic information; dilated convolution;
DOI
10.1109/ACCESS.2023.3342076
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Behavior recognition is a vital task in computer vision, yet semantic information extraction remains insufficient in existing behavior recognition models. In this paper, we propose Cnnformer, a transformer-based semantic information enhancement model for behavior recognition, to alleviate this problem. In Cnnformer, a new attention mechanism is designed and introduced into the encoder module. This attention mechanism uses dilated convolution to capture static context information, uses that static context to trigger the mining of dynamic context information, and finally fuses the dynamic and static context information. In addition, four convolution layers are added in front of the encoder module; their strong inductive bias helps extract shallow feature representations (such as color, geometry, and texture). Finally, Cnnformer combines the convolution module and the attention module in the encoder module to learn local and global features simultaneously, thereby enhancing the visual representation. Experimental results show that Cnnformer achieves higher behavior recognition performance: its Top-1 accuracy on the Kinetics-400 dataset is 3.4% higher than that of the baseline model.
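As a rough illustration of the dilated convolution mentioned in the abstract, the following minimal Python sketch shows how spacing out kernel taps widens the context each output sees without adding weights. It is not the authors' implementation: the function names `dilated_conv1d` and `receptive_field`, the 1-D setting, and the toy signal are all assumptions made for illustration only.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel (hypothetical helper)."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated convolution in cross-correlation form.

    Illustrative only: kernel taps are applied `dilation` positions apart,
    so the same number of weights covers a wider span of the input.
    """
    span = receptive_field(len(kernel), dilation)
    n_out = len(signal) - span + 1
    if n_out <= 0:
        return []  # input is shorter than the kernel's span
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(n_out)
    ]


# Toy input (assumed data, not from the paper).
signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]

dense = dilated_conv1d(signal, kernel, dilation=1)    # taps at i, i+1, i+2
dilated = dilated_conv1d(signal, kernel, dilation=2)  # taps at i, i+2, i+4
```

With three weights, a dilation of 2 spans five input positions instead of three, which is the sense in which dilation gathers broader context at the same parameter cost.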
Pages: 141299 - 141308
Number of pages: 10