Applying a Convolutional Vision Transformer for Emotion Recognition in Children with Autism: Fusion of Facial Expressions and Speech Features

Authors
Wang, Yonggu [1]
Pan, Kailin [1]
Shao, Yifan [1]
Ma, Jiarong [1]
Li, Xiaojuan [2]
Affiliations
[1] Zhejiang Univ Technol, Coll Educ, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ Finance & Econ, Mental Hlth Educ Ctr, Hangzhou 310018, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2025 / Vol. 15 / No. 06
Funding
National Natural Science Foundation of China;
Keywords
emotion recognition; multimodal feature fusion; deep learning; children with autism;
DOI
10.3390/app15063083
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
With advances in digital technology, including deep learning and big data analytics, new methods have been developed for autism diagnosis and intervention. Emotion recognition and the detection of autism in children are prominent subjects in autism research. Previous studies have typically relied on single-modal data to analyze the emotional states of children with autism, and the accuracy of their recognition algorithms leaves room for improvement. Our study creates two datasets, one of facial expressions and one of speech, that capture the emotions of children with autism in their natural states. A convolutional vision transformer-based emotion recognition model is constructed for each of the two datasets. The model achieves accuracies of 79.12% for facial expression recognition and 83.47% for Mel spectrogram recognition. We therefore propose a multimodal data fusion strategy for emotion recognition and construct a feature fusion model based on an attention mechanism, which attains a recognition accuracy of 90.73%. Finally, gradient-weighted class activation mapping is used to produce prediction heat maps that visualize facial expressions and speech features under four emotional states. This study offers a technical direction for the use of intelligent perception technology in special education and enriches the theory of emotional intelligence perception of children with autism.
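The abstract names attention-based feature fusion but does not describe the mechanism. As a rough illustration only, and not the authors' model, the general idea can be sketched as follows: each modality embedding receives a scalar score, the scores are normalized with a softmax, and the fused vector is the weighted sum of the embeddings. The function names, the `score_vector` parameter, and the toy feature values are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, score_vector):
    """Fuse equal-length modality embeddings with scalar attention weights.

    features: list of embeddings, e.g. [face_embedding, speech_embedding].
    score_vector: hypothetical learned vector used to score each embedding.
    Returns (fused_embedding, attention_weights).
    """
    # One scalar score per modality: dot product of tanh(embedding) and score_vector.
    scores = [
        sum(math.tanh(x) * w for x, w in zip(feat, score_vector))
        for feat in features
    ]
    weights = softmax(scores)
    dim = len(score_vector)
    # Weighted sum of the modality embeddings, dimension by dimension.
    fused = [
        sum(wt * feat[i] for wt, feat in zip(weights, features))
        for i in range(dim)
    ]
    return fused, weights


# Toy values standing in for features extracted from a face image and a Mel spectrogram.
face_feat = [0.2, -1.0, 0.5, 0.3]
speech_feat = [1.1, 0.4, -0.2, 0.9]
score_vector = [0.5, -0.3, 0.8, 0.1]

fused, weights = attention_fuse([face_feat, speech_feat], score_vector)
print(weights, fused)
```

In a trained model the scoring parameters would be learned jointly with the classifier, so the weights would reflect which modality is more informative for a given sample.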
Pages: 35
Related papers
50 in total
  • [21] Improvement of Speech Emotion Recognition by Deep Convolutional Neural Network and Speech Features
    Mohanty, Aniruddha
    Cherukuri, Ravindranath C.
    Prusty, Alok Ranjan
    THIRD CONGRESS ON INTELLIGENT SYSTEMS, CIS 2022, VOL 1, 2023, 608 : 117 - 129
  • [22] Applying Generative Adversarial Networks and Vision Transformers in Speech Emotion Recognition
    Heracleous, Panikos
    Fukayama, Satoru
    Ogata, Jun
    Mohammad, Yasser
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2022, 13519 LNCS : 67 - 75
  • [24] The Nature of Facial Emotion Recognition Impairments in Children on the Autism Spectrum
    Shanok, Nathaniel A.
    Jones, Nancy Aaron
    Lucas, Nikola N.
    CHILD PSYCHIATRY & HUMAN DEVELOPMENT, 2019, 50 (04) : 661 - 667
  • [25] Recognition of schematic facial displays of emotion in parents of children with autism
    Palermo, Mark T.
    Pasqualetti, Patrizio
    Barbati, Giulia
    Intelligente, Fabio
    Rossini, Paolo Maria
    AUTISM, 2006, 10 (04) : 353 - 364
  • [26] MFGCN: Multimodal fusion graph convolutional network for speech emotion recognition
    Qi, Xin
    Wen, Yujun
    Zhang, Pengzhou
    Huang, Heyan
    NEUROCOMPUTING, 2025, 611
  • [27] Emotion understanding in maltreated children: Recognition of facial expressions and integration with other emotion cues
    Camras, LA
    SachsAlter, E
    Ribordy, SC
    EMOTIONAL DEVELOPMENT IN ATYPICAL CHILDREN, 1996 : 203 - 225
  • [28] Recognition of emotion in facial expressions and vocal tones in children with psychopathic tendencies
    Stevens, D
    Charman, T
    Blair, RJR
    JOURNAL OF GENETIC PSYCHOLOGY, 2001, 162 (02) : 201 - 211
  • [29] Emotion Recognition using Facial Expressions in Children using the NAO Robot
    Lopez-Rincon, Alejandro
    2019 INTERNATIONAL CONFERENCE ON ELECTRONICS, COMMUNICATIONS AND COMPUTERS (CONIELECOMP), 2019 : 146 - 153
  • [30] Speech Emotion Recognition through Hybrid Features and Convolutional Neural Network
    Alluhaidan, Ala Saleh
    Saidani, Oumaima
    Jahangir, Rashid
    Nauman, Muhammad Asif
    Neffati, Omnia Saidani
    APPLIED SCIENCES-BASEL, 2023, 13 (08):