AESR: Speech Recognition With Speech Emotion Recogniting Learning

Times cited: 0
Authors
Han, RongQi [1]
Liu, Xin [1]
Zhang, Hui [1]
Affiliations
[1] Inner Mongolia Univ, Coll Comp Sci, Hohhot, Peoples R China
Keywords
Automatic Speech Recognition; Speech Emotion Recognition; Multi-task Learning; Character Error Rate; Word Error Rate;
DOI
10.1007/978-981-96-1045-7_8
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Modern Automatic Speech Recognition (ASR) systems aim to accurately convert spoken language into written text. However, they often face challenges when confronted with emotional speech, as traditional systems struggle to interpret the subtleties of emotional inflection. To overcome this challenge, a multi-task learning approach has been proposed that simultaneously addresses ASR and Speech Emotion Recognition (SER). With limited emotional speech resources, this approach has demonstrated improved recognition accuracy for the streaming ASR system when handling emotional utterances. Experiments conducted on both the MELD and SIMS datasets have shown a significant decrease in Word Error Rate (WER) and Character Error Rate (CER) when using the joint learning method compared to the optimized baseline. Specifically, the WER decreased by 1.27 on the MELD dataset and the CER decreased by 0.58 on the SIMS dataset.
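For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how WER and CER are conventionally computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length, over words for WER and over characters for CER. The function names (`edit_distance`, `wer`, `cer`) are illustrative assumptions, as are the whitespace tokenization and the choice to ignore spaces in CER. The record does not describe the authors' scoring pipeline, which may normalize text differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (hypothetical helper)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra token in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate, assuming whitespace-separated words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate, assuming spaces are ignored."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```

CER is the usual choice for languages written without word delimiters, which would be consistent with the abstract reporting CER on SIMS and WER on MELD.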
Pages: 91-101
Number of pages: 11