Influence of Emotional Speech on Continuous Speech Recognition

Cited by: 0
Authors
Zgank, Andrej [1]
Maucec, Mirjam Sepesy [1]
Affiliations
[1] Univ Maribor, Fac Elect Engn & Comp Sci, Maribor, Slovenia
Keywords
speech recognition; emotional speech; highly inflected language; Human-Computer Interaction; CLASSIFICATION; FEATURES;
DOI
10.1109/elektro49696.2020.9130316
Chinese Library Classification
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
Emotions are an important part of human communication, but they can present harsh conditions for an automatic continuous speech recognition system. This paper analyzes the extent to which emotional speech degrades speech recognition accuracy for Slovenian, a highly inflected language. Language characteristics also influence speech recognition performance, and inflection is one of the most challenging of them. Moreover, Slovenian belongs to the group of under-resourced languages, like other Slavic languages. The speech recognition system was developed with the Slovenian BNSI Broadcast News speech database. The Interface speech database was used for the experiments with emotional speech. The analysis was carried out with HMM and DNN acoustic models, combined with a 3-gram statistical language model. The results show that emotional speech degrades speech recognition accuracy by between 5% and 7% absolute.
Pages: 4
Related papers
50 in total
  • [1] CONTINUOUS SPEECH RECOGNITION
    MORGAN, N
    BOURLARD, H
    [J]. IEEE SIGNAL PROCESSING MAGAZINE, 1995, 12 (03) : 25 - 42
  • [2] Speech Emotion Recognition Based on Gender Influence in Emotional Expression
    Vasuki, P.
    Bharati, Divya R.
    [J]. INTERNATIONAL JOURNAL OF INTELLIGENT INFORMATION TECHNOLOGIES, 2019, 15 (04) : 22 - 40
  • [3] CONTINUOUS VISUAL SPEECH RECOGNITION FOR AUDIO SPEECH ENHANCEMENT
    Benhaim, Eric
    Sahbi, Hichem
    Vitte, Guillaume
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 2244 - 2248
  • [4] Continuous speech recognition for clinicians
    Zafar, A
    Overhage, JM
    McDonald, CJ
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 1999, 6 (03) : 195 - 204
  • [5] COMPUTER RECOGNITION OF CONTINUOUS SPEECH
    PURVES, RB
    STRONG, WJ
    [J]. ACUSTICA, 1976, 35 (02) : 111 - 121
  • [6] PRACTICAL AND CONTINUOUS SPEECH RECOGNITION
    ROSS, S
    MACALLISTER, J
    [J]. COMPUTER DESIGN, 1984, 23 (07) : 69 - &
  • [7] WORD RECOGNITION IN CONTINUOUS SPEECH
    TABOSSI, P
    SCOTT, D
    BURANI, C
    [J]. BULLETIN OF THE PSYCHONOMIC SOCIETY, 1991, 29 (06) : 529 - 529
  • [8] Learning Emotional Representations from Imbalanced Speech Data for Speech Emotion Recognition and Emotional Text-to-Speech
    Wang, Shijun
    Gudnason, Jon
    Borth, Damian
    [J]. INTERSPEECH 2023, 2023, : 351 - 355
  • [9] Generative emotional AI for speech emotion recognition: The case for synthetic emotional speech augmentation
    Latif, Siddique
    Shahid, Abdullah
    Qadir, Junaid
    [J]. APPLIED ACOUSTICS, 2023, 210
  • [10] Recognition of Emotional States in Natural Speech
    Kaminska, Dorota
    Sapinski, Tomasz
    Pelikant, Adam
    [J]. 2013 SIGNAL PROCESSING SYMPOSIUM (SPS), 2013,