Emotional speech classification using Gaussian mixture models

Cited by: 20
Authors
Ververidis, D [1]
Kotropoulos, C [1]
Affiliation
[1] Aristotle Univ Thessaloniki, Dept Informat, Thessaloniki 54124, Greece
DOI
10.1109/ISCAS.2005.1465226
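Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification code
0808; 0809
Abstract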
This paper studies the classification of utterances into five basic emotional states. A total of 87 statistical characteristics of pitch, energy, and formants are extracted from 500 utterances of the Danish Emotional Speech database. The classification capability of each feature is evaluated by the probability of correct classification achieved by a Bayes classifier that models the feature probability density function as a mixture of Gaussian densities. Next, the feature subset that yields the highest probability of correct classification is found using the Sequential Floating Forward Selection algorithm. The probability of correct classification is estimated via cross-validation, and the probability density functions are modelled as mixtures of 2 or 3 Gaussian densities. The results show that the Bayes classifier employing mixtures of 2 Gaussian densities achieves a probability of correct classification of 0.55, whereas the human classification score for the same database is 0.67 and random classification would give 0.20.
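The single-feature evaluation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes one scalar feature, synthetic two-class data in place of the Danish Emotional Speech database, a plain EM fit with random initialisation, and a small variance floor. The function names (`fit_gmm_1d`, `train_bayes`, `cv_accuracy`) are hypothetical, and the Sequential Floating Forward Selection step is not shown.

```python
import math
import random


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=50, seed=0):
    """Fit a k-component 1-D Gaussian mixture with EM (assumed initialisation)."""
    rng = random.Random(seed)
    means = rng.sample(xs, k)  # start the means at k random data points
    mean_all = sum(xs) / len(xs)
    var_all = sum((x - mean_all) ** 2 for x in xs) / len(xs) + 1e-6
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        resp = []
        for x in xs:
            p = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            s = sum(p) or 1e-300
            resp.append([pi / s for pi in p])
        # M-step: re-estimate weights, means, variances (with a variance floor)
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj + 1e-6
    return weights, means, variances


def gmm_pdf(x, model):
    """Mixture density at x for a (weights, means, variances) model."""
    weights, means, variances = model
    return sum(w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def train_bayes(xs, ys, k=2):
    """Fit one GMM per class and estimate class priors from label frequencies."""
    models, priors = {}, {}
    for label in set(ys):
        cls = [x for x, y in zip(xs, ys) if y == label]
        models[label] = fit_gmm_1d(cls, k)
        priors[label] = len(cls) / len(xs)
    return models, priors


def predict(x, models, priors):
    """Bayes decision rule: pick the class maximising prior * class-conditional density."""
    return max(models, key=lambda c: priors[c] * gmm_pdf(x, models[c]))


def cv_accuracy(xs, ys, k=2, folds=5):
    """Estimate the probability of correct classification by k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(0).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        models, priors = train_bayes([xs[i] for i in train], [ys[i] for i in train], k)
        correct += sum(predict(xs[i], models, priors) == ys[i] for i in test)
    return correct / len(xs)


# Synthetic stand-in data: two well-separated classes on one feature.
rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(60)] + [rng.gauss(6, 1) for _ in range(60)]
ys = ["a"] * 60 + ["b"] * 60
acc = cv_accuracy(xs, ys, k=2)
print(f"cross-validated probability of correct classification: {acc:.2f}")
```

Running `cv_accuracy` once per feature would give a per-feature score comparable in spirit to the evaluation the abstract describes; the real task has five classes and far more class overlap, so scores there are much lower than on this toy data.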
Pages: 2871-2874
Page count: 4
Related papers
  • [31] Unsupervised classification with non-Gaussian mixture models using ICA
    Lee, TW
    Lewicki, MS
    Sejnowski, T
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 11, 1999, 11 : 508 - 514
  • [32] Antenna Classification Using Gaussian Mixture Models (GMM) and Machine Learning
    Ma, Yihan
    Hao, Yang
    [J]. IEEE OPEN JOURNAL OF ANTENNAS AND PROPAGATION, 2020, 1 (01): 320 - 328
  • [33] Traffic Classification and Verification using Unsupervised Learning of Gaussian Mixture Models
    Alizadeh, Hassan
    Khoshrou, Abdolrahman
    Zuquete, Andre
    [J]. 2015 IEEE INTERNATIONAL WORKSHOP ON MEASUREMENTS AND NETWORKING (M&N), 2015, : 94 - 99
  • [34] Secure sound classification: Gaussian mixture models
    Shashanka, Madhusudana V. S.
    Smaragdis, Paris
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 3539 - 3542
  • [35] Vowel Recognition from Telephonic Speech Using MFCCs and Gaussian Mixture Models
    Koolagudi, Shashidhar G.
    Thakur, Sujata Negi
    Barthwal, Anurag
    Singh, Manoj Kumar
    Rawat, Ramesh
    Rao, K. Sreenivasa
    [J]. ECO-FRIENDLY COMPUTING AND COMMUNICATION SYSTEMS, 2012, 305 : 170 - +
  • [36] REALTIME SPEECH-DRIVEN FACIAL ANIMATION USING GAUSSIAN MIXTURE MODELS
    Luo, Changwei
    Yu, Jun
    Li, Xian
    Wang, Zengfu
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (ICMEW), 2014,
  • [37] Analysis of Lombard and Angry Speech Using Gaussian Mixture Models and KL Divergence
    Mittal, Shubham
    Vyas, Swati
    Prasanna, S. R. M.
    [J]. 2013 NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2013,
  • [38] Loss-Scaled Large-Margin Gaussian Mixture Models for Speech Emotion Classification
    Yun, Sungrack
    Yoo, Chang D.
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (02): 585 - 598
  • [39] Complementary Gaussian Mixture Models for Multimodal Speech Recognition
    Sad, Gonzalo D.
    Terissi, Lucas D.
    Gomez, Juan C.
    [J]. MULTIMODAL PATTERN RECOGNITION OF SOCIAL SIGNALS IN HUMAN-COMPUTER-INTERACTION, 2015, 8869 : 54 - 65
  • [40] Gaussian mixture models of phonetic boundaries for speech recognition
    Omar, MK
    Hasegawa-Johnson, M
    Levinson, S
    [J]. ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001, : 33 - 36