Emotional Speech Recognition of Holocaust Survivors with Deep Neural Network Models for Russian Language

Cited by: 0
Authors
Bukreeva, Liudmila [1 ]
Guseva, Daria [1 ]
Dolgushin, Mikhail [1 ]
Evdokimova, Vera [1 ]
Obotnina, Vasilisa [1 ]
Affiliations
[1] St Petersburg State University, Universitetskaya Emb. 7-9, St Petersburg 199034, Russia
Source
Keywords
Question Answering; Corpora; Visual History Archives;
DOI
10.1007/978-3-031-48309-7_6
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Recognition of highly emotional speech remains a challenging case for automatic speech recognition. The aim of this article is to carry out experiments on highly emotional speech recognition using oral history archives provided by the Yad Vashem foundation. The material consists of elderly people's emotional speech, marked by accents and colloquial language. We analyze and preprocess 26 hours of publicly available video interviews with Holocaust survivors. Our objective was to develop a system able to perform emotional speech recognition based on deep neural network models. We present and evaluate the obtained results, which contribute to the research field of oral history archives.
Pages: 68-76
Page count: 9
Related papers
50 items in total
  • [31] Indonesian speech recognition based on Deep Neural Network
    Yang, Ruolin
    Yang, Jian
    Lu, Yu
    2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021, : 36 - 41
  • [32] Speech Emotion Recognition Based on Deep Neural Network
    Zhu, Zijiang
    Hu, Yi
    Li, Junshan
    Li, Jianjun
    Wang, Junhua
    BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2020, 126 : 154 - 154
  • [33] Deep Neural Network Based Speech Separation for Robust Speech Recognition
    Tu Yanhui
    Jun, Du
    Xu Yong
    Dai Lirong
    Chin-Hui, Lee
    2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014, : 532 - 536
  • [34] Comparison of Neural Network Models for Speech Emotion Recognition
    Palo, Hemanta Kumar
    Sagar, Sangeet
    2ND INTERNATIONAL CONFERENCE ON DATA SCIENCE AND BUSINESS ANALYTICS (ICDSBA 2018), 2018, : 127 - 131
  • [35] LIMITED-MEMORY BFGS OPTIMIZATION OF RECURRENT NEURAL NETWORK LANGUAGE MODELS FOR SPEECH RECOGNITION
    Liu, Xunying
    Liu, Shansong
    Sha, Jinze
    Yu, Jianwei
    Xu, Zhiyuan
    Chen, Xie
    Meng, Helen
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6114 - 6118
  • [36] Mixed Precision Low-Bit Quantization of Neural Network Language Models for Speech Recognition
    Xu, Junhao
    Yu, Jianwei
    Hu, Shoukang
    Liu, Xunying
    Meng, Helen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3679 - 3693
  • [37] Neural candidate-aware language models for speech recognition
    Tanaka, Tomohiro
    Masumura, Ryo
    Oba, Takanobu
    COMPUTER SPEECH AND LANGUAGE, 2021, 66
  • [38] DOMAIN-AWARE NEURAL LANGUAGE MODELS FOR SPEECH RECOGNITION
    Liu, Linda
    Gu, Yile
    Gourav, Aditya
    Gandhe, Ankur
    Kalmane, Shashank
    Filimonov, Denis
    Rastrow, Ariya
    Bulyko, Ivan
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7373 - 7377
  • [39] Neural Error Corrective Language Models for Automatic Speech Recognition
    Tanaka, Tomohiro
    Masumura, Ryo
    Masataki, Hirokazu
    Aono, Yushi
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 401 - 405
  • [40] A Unified Deep Neural Network for Speaker and Language Recognition
    Richardson, Fred
    Reynolds, Doug
    Dehak, Najim
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1146 - 1150