Engagement Recognition in Online Learning Based on an Improved Video Vision Transformer

Cited by: 1
Authors
Guo, Zijian [1 ]
Zhou, Zhuoyi [1 ]
Pan, Jiahui [1 ]
Liang, Yan [1 ]
Affiliations
[1] South China Normal Univ, Sch Software, Guangzhou, Peoples R China
Keywords
DOI
10.1109/IJCNN54540.2023.10191579
Chinese Library Classification:
TP18 [Artificial Intelligence Theory];
Discipline Classification Code:
081104; 0812; 0835; 1405;
Abstract
Online learning has gained wide attention and adoption due to its flexibility and convenience. However, because teachers and students are separated in time and space, teachers cannot easily observe students' level of engagement, which reduces teaching effectiveness. Automatic detection of student engagement is an effective way to address this problem: it helps teachers obtain timely feedback from students and adjust the teaching schedule. In this paper, the transformer is applied to engagement recognition for the first time, and a novel network based on an improved video vision transformer (ViViT) is proposed to detect student engagement. A new transformer encoder, named Transformer Encoder with Low Complexity (TELC), is proposed. It adopts unit force operated attention (UFO-attention) to eliminate the nonlinearity of the original self-attention in standard ViViT, and Patch Merger to fuse the input patches, which allows the network to significantly reduce computational complexity while improving performance. The proposed method is evaluated on the Dataset for Affective States in E-learning Environments (DAiSEE) and achieves an accuracy of 63.91% on the four-level classification task, outperforming state-of-the-art methods. The experimental results demonstrate the effectiveness of our method, making it well suited to practical online-learning applications.
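The abstract names two building blocks of TELC: UFO-attention and Patch Merger. The sketch below is not the authors' code; it is a minimal single-head numpy illustration based on our reading of the published UFO-ViT and Patch Merger techniques. The key idea of UFO-attention is that, with softmax replaced by an L2 ("XNorm") normalization, the product K^T V (d x d) can be formed first, avoiding the N x N attention map; Patch Merger reduces N tokens to M via a learnable (M, d) matrix. All shapes, the `gamma` scale, and the single-head layout are simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def xnorm(x, gamma=1.0, axis=-1):
    """Unit-hypersphere (L2) normalization used in place of softmax."""
    return gamma * x / (np.linalg.norm(x, axis=axis, keepdims=True) + 1e-6)

def ufo_attention(q, k, v):
    """Linear-complexity attention sketch: compute k.T @ v (d x d) first
    and normalize both factors, so no N x N attention map is formed."""
    kv = xnorm(k.T @ v)        # (d, d)
    return xnorm(q) @ kv       # (N, d), cost O(N * d^2)

def patch_merger(tokens, weight):
    """Merge N tokens into M tokens with a learnable (M, d) matrix:
    softmax over the N inputs gives per-output mixing weights."""
    scores = softmax(weight @ tokens.T, axis=-1)   # (M, N)
    return scores @ tokens                         # (M, d)

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 32))    # 64 patch tokens, embedding dim 32
attn_out = ufo_attention(x, x, x)    # (64, 32)
merged = patch_merger(attn_out, rng.standard_normal((8, 32)))
print(attn_out.shape, merged.shape)  # (64, 32) (8, 32)
```

Because the (d, d) product replaces the (N, N) attention map, cost grows linearly in the number of tokens N, which is what makes the encoder cheaper for video, where N is large.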
Pages: 8
Related Papers (50 records)
  • [1] Video captioning based on vision transformer and reinforcement learning
    Zhao, Hong
    Chen, Zhiwen
    Guo, Lan
    Han, Zeyu
    [J]. PeerJ Computer Science, 2022, 8
  • [3] A Video Face Recognition Leveraging Temporal Information Based on Vision Transformer
    Zhang, Hui
    Yang, Jiewen
    Dong, Xingbo
    Lv, Xingguo
    Jia, Wei
    Jin, Zhe
    Li, Xuejun
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT V, 2024, 14429 : 29 - 43
  • [4] Online Continual Learning with Contrastive Vision Transformer
    Wang, Zhen
    Liu, Liu
    Kong, Yajing
    Guo, Jiaxian
    Tao, Dacheng
    [J]. COMPUTER VISION, ECCV 2022, PT XX, 2022, 13680 : 631 - 650
  • [5] RESEARCH ON IMAGE RECOGNITION OF ETHNIC MINORITY CLOTHING BASED ON IMPROVED VISION TRANSFORMER
    Wang, Taishen
    Wen, Bin
    [J]. MATHEMATICAL FOUNDATIONS OF COMPUTING, 2024, 7 (01): : 84 - 97
  • [6] Emotion Recognition of College Students' Online Learning Engagement Based on Deep Learning
    Wang, Chunyan
    [J]. INTERNATIONAL JOURNAL OF EMERGING TECHNOLOGIES IN LEARNING, 2022, 17 (06) : 110 - 122
  • [7] An improved Vision Transformer model for the recognition of blood cells
    Sun, Tianyu
    Zhu, Qingtao
    Yang, Jian
    Zeng, Liang
    [J]. Shengwu Yixue Gongchengxue Zazhi/Journal of Biomedical Engineering, 2022, 39 (06): : 1097 - 1107
  • [8] k-NN attention-based video vision transformer for action recognition
    Sun, Weirong
    Ma, Yujun
    Wang, Ruili
    [J]. NEUROCOMPUTING, 2024, 574
  • [9] Improved Deepfake Video Detection Using Convolutional Vision Transformer
    Deressa, Deressa Wodajo
    Lambert, Peter
    Van Wallendael, Glenn
    Atnafu, Solomon
    Mareen, Hannes
    [J]. 2024 IEEE GAMING, ENTERTAINMENT, AND MEDIA CONFERENCE, GEM 2024, 2024, : 492 - 497
  • [10] Evaluation of Students' Learning Engagement in Online Classes Based on Multimodal Vision Perspective
    Qi, Yongfeng
    Zhuang, Liqiang
    Chen, Huili
    Han, Xiang
    Liang, Anye
    [J]. ELECTRONICS, 2024, 13 (01)