From Simulated Speech to Natural Speech, What are the Robust Features for Emotion Recognition?

Cited by: 0
Authors
Li, Ya [1]
Chao, Linlin [1]
Liu, Yazhu [1, 2]
Bao, Wei [1, 2]
Tao, Jianhua [1, 3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, NLPR, Beijing, Peoples R China
[2] Jiangsu Normal Univ, Inst Linguist Sci, Xuzhou, Peoples R China
[3] Chinese Acad Sci, Inst Comp Technol, Beijing Key Lab Mobile Comp & Pervas Device, Beijing, Peoples R China
Keywords
emotion recognition; simulated emotion; natural emotion; robust feature selection; QUALITY;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The earliest research on emotion recognition started with simulated (acted) corpora of stereotypical emotions and later extended to elicited corpora. Recently, the demand for real applications has forced research to shift to natural, spontaneous corpora. Previous research shows that emotion recognition accuracy declines gradually from simulated speech to elicited speech to fully natural speech. This paper investigates the effects of the commonly used spectral, prosodic, and voice quality features on emotion recognition across the three types of corpus, and identifies the features that are robust for emotion recognition on natural speech. Emotion recognition with several common machine learning methods is carried out and thoroughly compared. Three feature selection methods are applied to find the robust features. The results on six commonly used corpora confirm that recognition accuracy decreases as the corpus changes from simulated to natural. In addition, prosodic and voice quality features are robust for emotion recognition on simulated corpora, while spectral features are robust on elicited and natural corpora.
Pages: 368-373
Number of pages: 6