Applying a Convolutional Vision Transformer for Emotion Recognition in Children with Autism: Fusion of Facial Expressions and Speech Features

Authors
Wang, Yonggu [1]
Pan, Kailin [1]
Shao, Yifan [1]
Ma, Jiarong [1]
Li, Xiaojuan [2]
Affiliations
[1] Zhejiang Univ Technol, Coll Educ, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ Finance & Econ, Mental Hlth Educ Ctr, Hangzhou 310018, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2025 / Vol. 15 / Issue 06
Funding
National Natural Science Foundation of China
Keywords
emotion recognition; multimodal feature fusion; deep learning; children with autism
DOI
10.3390/app15063083
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Advances in digital technology, including deep learning and big data analytics, have produced new methods for autism diagnosis and intervention. Emotion recognition and autism detection in children are prominent topics in autism research. Previous studies have typically relied on single-modal data to analyze the emotional states of children with autism, and the accuracy of the resulting recognition algorithms still needs improvement. Our study builds datasets of the facial and speech emotions of children with autism in their natural states. A convolutional vision transformer-based emotion recognition model is constructed for the two datasets. The model achieves accuracies of 79.12% for facial expression recognition and 83.47% for Mel spectrogram recognition. Building on these results, we propose a multimodal data fusion strategy for emotion recognition and construct a feature fusion model based on an attention mechanism, which attains a recognition accuracy of 90.73%. Finally, gradient-weighted class activation mapping (Grad-CAM) is used to produce prediction heat maps that visualize facial expressions and speech features under four emotional states. This study offers a technical direction for applying intelligent perception technology in special education and enriches the theory of emotional intelligence perception in children with autism.
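The abstract mentions a feature fusion model based on an attention mechanism but does not describe its architecture. The sketch below illustrates only the general idea of attention-weighted fusion of two modality feature vectors, and it is not the authors' model. The function name `attention_fuse`, the `query` vector (a stand-in for learned parameters), and the toy feature values are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(
    face_feat: Sequence[float],
    speech_feat: Sequence[float],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Fuse two same-length feature vectors with attention weights.

    Each modality gets a scaled dot-product score against `query`
    (hypothetical: in a trained model this would be learned). The
    scores are normalized with softmax and used to form a weighted
    sum of the two feature vectors.
    """
    if not (len(face_feat) == len(speech_feat) == len(query)):
        raise ValueError("all vectors must have the same length")
    scale = math.sqrt(len(query))
    scores = [
        sum(q * x for q, x in zip(query, feat)) / scale
        for feat in (face_feat, speech_feat)
    ]
    weights = softmax(scores)
    fused = [
        weights[0] * f + weights[1] * s
        for f, s in zip(face_feat, speech_feat)
    ]
    return fused, weights


# Toy inputs (made up): 4-dimensional features for each modality.
face = [0.2, 0.8, 0.1, 0.5]
speech = [0.9, 0.1, 0.4, 0.3]
query = [1.0, 0.0, 0.0, 0.0]

fused, weights = attention_fuse(face, speech, query)
print("modality weights (face, speech):", weights)
print("fused feature:", fused)
```

In this toy setup the query attends to the first feature dimension, so the modality with the larger value there (speech) receives the larger weight. A real model would typically learn the scoring parameters jointly with the classifier and operate on much higher-dimensional features.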
Pages: 35