Initial Analysis of the Impact of Emotional Speech on the Performance of Speaker Recognition on New Serbian Emotional Database

Cited by: 1
Authors
Mandaric, Igor [1]
Vujovic, Mia [1]
Suzic, Sinisa [1]
Nosek, Tijana [1]
Simic, Nikola [1]
Delic, Vlado [1]
Affiliations
[1] Fac Tech Sci, Novi Sad, Serbia
Keywords
acoustic features; speech database; machine learning; speaker recognition; identification
DOI
10.1109/TELFOR52709.2021.9653376
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we compared three ML methods (kNN, SVM, and MLP) to build optimal speaker recognition models for two datasets with different recording conditions. We studied the impact of different speech features on classification performance, with the main focus on MFCCs. All models were built using neutral speech, but their performance on emotional test data was also analyzed. The achieved accuracy on neutral-style speech was approximately 99% for the SEAC dataset and approximately 97% for VCTK. We observed a significant decrease in accuracy on emotional data. Results improved when other features from the Interspeech 2009 feature set were added to the MFCCs during model creation.
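The evaluation protocol described in the abstract (train on neutral speech only, then test separately on neutral and emotional speech) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the feature vectors are synthetic stand-ins for per-utterance MFCC statistics, the emotional condition is simulated as a constant shift plus extra noise, and the classifier is a plain kNN written with the standard library only. The names `make_samples`, `knn_predict`, `accuracy`, and `run_experiment` are hypothetical.

```python
import math
import random


def make_samples(rng, centroids, n_per_speaker, noise, shift=0.0):
    """Draw synthetic feature vectors around each speaker's centroid.

    `shift` and a larger `noise` are an assumed, simplified way to mimic
    how emotional speech might move features away from neutral speech.
    """
    samples = []
    for speaker, centroid in enumerate(centroids):
        for _ in range(n_per_speaker):
            vec = [c + shift + rng.gauss(0.0, noise) for c in centroid]
            samples.append((vec, speaker))
    return samples


def knn_predict(train, query, k=3):
    """Return the majority speaker label among the k nearest training vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def accuracy(train, test, k=3):
    """Fraction of test vectors assigned to the correct speaker."""
    hits = sum(knn_predict(train, vec, k) == label for vec, label in test)
    return hits / len(test)


def run_experiment(seed=0, n_speakers=5, n_dims=13):
    """Train on neutral data only; score neutral and emotional test sets."""
    rng = random.Random(seed)
    # Deterministic, well-separated speaker centroids (illustrative values).
    centroids = [
        [4.0 * ((s + d) % n_speakers) for d in range(n_dims)]
        for s in range(n_speakers)
    ]
    train = make_samples(rng, centroids, 20, noise=0.3)
    neutral_test = make_samples(rng, centroids, 10, noise=0.3)
    emotional_test = make_samples(rng, centroids, 10, noise=3.0, shift=2.0)
    return accuracy(train, neutral_test), accuracy(train, emotional_test)


if __name__ == "__main__":
    neutral_acc, emotional_acc = run_experiment()
    print(f"neutral test accuracy:   {neutral_acc:.2f}")
    print(f"emotional test accuracy: {emotional_acc:.2f}")
```

Keeping the training set purely neutral while scoring the two test conditions separately is what makes any mismatch between the conditions visible. The numbers this sketch prints reflect only the synthetic data and say nothing about the SEAC or VCTK results reported above.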
Pages: 4