Initial Analysis of the Impact of Emotional Speech on the Performance of Speaker Recognition on New Serbian Emotional Database

Cited by: 1
Authors
Mandaric, Igor [1]
Vujovic, Mia [1]
Suzic, Sinisa [1]
Nosek, Tijana [1]
Simic, Nikola [1]
Delic, Vlado [1]
Affiliations
[1] Fac Tech Sci, Novi Sad, Serbia
Keywords
acoustic features; speech database; machine learning; speaker recognition; identification
DOI
10.1109/TELFOR52709.2021.9653376
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we compared three machine learning methods (kNN, SVM, and MLP) to build optimal speaker recognition models for two datasets with different recording conditions. We studied the impact of different speech features on classification performance, focusing mainly on MFCCs. All models were built using neutral speech, and their performance on emotional test data was also analyzed. The accuracy achieved on neutral speech was approximately 99% for the SEAC dataset and approximately 97% for VCTK. We observed a significant decrease in accuracy on emotional data. An improvement occurred when other features from the Interspeech 2009 feature set were added to the MFCCs during model creation.
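The evaluation setup described in the abstract (train on neutral speech, test on both neutral and emotional speech) can be illustrated with a toy sketch. This is not the authors' pipeline: the feature vectors are synthetic 13-dimensional stand-ins for MFCC statistics, the constant `shift` is a hypothetical and deliberately crude model of an emotional speaking style, and the kNN classifier is a plain standard-library implementation. The names `make_utterances`, `knn_predict`, and `accuracy` are illustrative only.

```python
import math
import random
from collections import Counter

def make_utterances(centers, n_per_speaker, spread, shift, rng):
    """Draw synthetic feature vectors around each speaker's center.

    `shift` is added to every dimension; it is a hypothetical stand-in
    for the feature mismatch caused by an emotional speaking style.
    """
    data = []
    for speaker, center in centers.items():
        for _ in range(n_per_speaker):
            vec = [c + shift + rng.gauss(0.0, spread) for c in center]
            data.append((vec, speaker))
    return data

def knn_predict(train, vec, k=3):
    """Return the majority speaker label among the k nearest training vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], vec))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, test, k=3):
    """Fraction of test vectors whose predicted speaker matches the true one."""
    hits = sum(knn_predict(train, vec, k) == label for vec, label in test)
    return hits / len(test)

rng = random.Random(0)

# Four synthetic speakers with well-separated centers (13 dims, like 13 MFCCs).
centers = {f"spk{i}": [4.0 * i] * 13 for i in range(4)}

# Models see neutral speech only; emotional speech appears only at test time.
train = make_utterances(centers, 20, 0.5, 0.0, rng)
neutral_test = make_utterances(centers, 10, 0.5, 0.0, rng)
emotional_test = make_utterances(centers, 10, 0.5, 3.0, rng)

neutral_acc = accuracy(train, neutral_test)
emotional_acc = accuracy(train, emotional_test)
print(f"neutral: {neutral_acc:.2f}, emotional: {emotional_acc:.2f}")
```

In this toy setting the drop on the shifted test set comes entirely from the mismatch between training and test conditions, which is the effect the abstract reports for real emotional speech. The size of the drop here is an artifact of the chosen `shift` and says nothing about the paper's actual numbers.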
Pages: 4
Related papers
50 in total
  • [1] Impact of Emotional Speech to Automatic Speaker Recognition - Experiments on GEES Speech Database
    Jokic, Ivan
    Jokic, Stevan
    Delic, Vlado
    Peric, Zoran
    [J]. SPEECH AND COMPUTER, 2014, 8773 : 268 - 275
  • [2] Automatic emotional speech recognition in Serbian language
    Bojanic, Milana
    Delic, Vlado
    [J]. 2013 21ST TELECOMMUNICATIONS FORUM (TELFOR), 2013, : 459 - 465
  • [3] Creation and Analysis of Emotional Speech Database for Multiple Emotions Recognition
    Sato, Ryota
    Sasaki, Ryohei
    Suga, Norisato
    Furukawa, Toshihiro
    [J]. PROCEEDINGS OF 2020 23RD CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (ORIENTAL-COCOSDA 2020), 2020, : 33 - 37
  • [4] Emotion Attribute Projection for Speaker Recognition on Emotional Speech
    Bao, Huanjun
    Xu, Mingxing
    Zheng, Thomas Fang
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 601 - 604
  • [5] Emotional Speech Clustering based Robust Speaker Recognition System
    Li, Dongdong
    Yang, Yingchun
    [J]. PROCEEDINGS OF THE 2009 2ND INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOLS 1-9, 2009, : 4576 - +
  • [6] New Features for Emotional Speech Recognition
    Palo, Hemanta Kumar
    Mohanty, Mihir Narayan
    Chandra, Mahesh
    [J]. 2015 IEEE POWER, COMMUNICATION AND INFORMATION TECHNOLOGY CONFERENCE (PCITC-2015), 2015, : 424 - 429
  • [7] Emotional Vocal Expressions Recognition Using the COST 2102 Italian Database of Emotional Speech
    Atassi, Hicham
    Riviello, Maria Teresa
    Smekal, Zdenek
    Hussain, Amir
    Esposito, Anna
    [J]. DEVELOPMENT OF MULTIMODAL INTERFACES: ACTIVE LISTENING AND SYNCHRONY, 2010, 5967 : 255 - +
  • [8] Speaker Recognition in Emotional Environment
    Koolagudi, Shashidhar G.
    Sharma, Kritika
    Rao, K. Sreenivasa
    [J]. ECO-FRIENDLY COMPUTING AND COMMUNICATION SYSTEMS, 2012, 305 : 117 - +
  • [9] Applying Emotional Factor Analysis and I-Vector to Emotional Speaker Recognition
    Chen, Li
    Yang, Yingchun
    [J]. BIOMETRIC RECOGNITION: CCBR 2011, 2011, 7098 : 174 - 179
  • [10] Speaker Recognition Using Constrained Convolutional Neural Networks in Emotional Speech
    Simic, Nikola
    Suzic, Sinisa
    Nosek, Tijana
    Vujovic, Mia
    Peric, Zoran
    Savic, Milan
    Delic, Vlado
    [J]. ENTROPY, 2022, 24 (03)