Influence of Emotional Speech on Continuous Speech Recognition

Cited by: 0
Authors
Zgank, Andrej [1]
Maucec, Mirjam Sepesy [1]
Affiliations
[1] Univ Maribor, Fac Elect Engn & Comp Sci, Maribor, Slovenia
Keywords
speech recognition; emotional speech; highly inflected language; Human-Computer Interaction; classification; features
DOI
10.1109/elektro49696.2020.9130316
Chinese Library Classification number
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
Emotions are an important part of human communication, but they can present harsh conditions for an automatic continuous speech recognition system. This paper analyzes the extent to which emotional speech degrades speech recognition accuracy for Slovenian, a highly inflected language. Language characteristics also influence speech recognition performance, and inflection is one of the most challenging of them. Moreover, Slovenian, like other Slavic languages, belongs to the group of under-resourced languages. The speech recognition system was developed with the Slovenian BNSI Broadcast News speech database. The Interface speech database was used for the experiments with emotional speech. The analysis was carried out with HMM and DNN acoustic models, combined with a 3-gram statistical language model. The results show that emotional speech degrades speech recognition accuracy by between 5% and 7% absolute.
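The abstract reports the degradation in absolute terms, that is, as a difference in percentage points between two accuracy scores. The sketch below shows one common way such a figure could be computed. It assumes word accuracy is defined through the word-level edit distance (substitutions, deletions, insertions) and uses hypothetical helper names and illustrative numbers; none of it is taken from the paper's actual evaluation setup.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Minimum number of word substitutions, deletions and insertions."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,              # deletion of a reference word
                curr[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),   # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_accuracy(pairs) -> float:
    """Word accuracy in percent over (reference, hypothesis) pairs."""
    errors = sum(word_errors(ref, hyp) for ref, hyp in pairs)
    words = sum(len(ref.split()) for ref, _ in pairs)
    return 100.0 * (1 - errors / words)


def degradation(neutral_acc: float, emotional_acc: float):
    """Return (absolute, relative) accuracy degradation.

    Absolute is in percentage points; relative is a percentage
    of the neutral-speech accuracy.
    """
    absolute = neutral_acc - emotional_acc
    relative = 100.0 * absolute / neutral_acc
    return absolute, relative


# Illustrative values only, not results from the paper.
absolute, relative = degradation(80.0, 74.0)
print(f"absolute: {absolute:.1f} points, relative: {relative:.1f}%")
```

With these illustrative inputs, a drop from 80% to 74% is 6 points absolute but 7.5% relative, which is why the abstract's qualifier matters when comparing results.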
Pages: 4