Emotional speech classification using Gaussian mixture models

Cited by: 20
Authors
Ververidis, D [1 ]
Kotropoulos, C [1 ]
Affiliations
[1] Aristotle Univ Thessaloniki, Dept Informat, Thessaloniki 54124, Greece
DOI
10.1109/ISCAS.2005.1465226
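Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Subject classification codes
0808 ; 0809 ;
Abstract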
In this paper, the classification of utterances into five basic emotional states is studied. A total of 87 statistical characteristics of pitch, energy, and formants are extracted from 500 utterances of the Danish Emotional Speech database. The classification capability of each feature is evaluated with respect to the probability of correct classification achieved by the Bayes classifier that models the feature probability density function as a mixture of Gaussian densities. Next, the feature subset that yields the highest probability of correct classification is found using the Sequential Floating Forward Selection algorithm. The probability of correct classification is estimated via cross-validation, and the probability density functions are modelled as mixtures of 2 or 3 Gaussian densities. The results demonstrate that the Bayes classifier that employs mixtures of 2 Gaussian densities can achieve a probability of correct classification equal to 0.55, whereas the human classification score is 0.67 for the database considered and random classification would give a probability of correct classification equal to 0.20.
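The single-feature evaluation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one scalar feature, a hand-written EM fit, a quantile-based initialisation, and synthetic data with hypothetical class labels "A", "B", "C" in place of the Danish Emotional Speech features. All function names (`fit_gmm_1d`, `train_bayes`, `classify`, `cross_val_accuracy`, `make_demo_data`) are illustrative. Each class density is modelled as a mixture of Gaussians, the Bayes rule picks the class with the largest prior times density, and the probability of correct classification is estimated by cross-validation.

```python
import math
import random


def gauss(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=50, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM; returns (weights, means, variances)."""
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic start (an assumption of this sketch): means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(xs) / n
    var0 = max(sum((x - centre) ** 2 for x in xs) / n, var_floor)
    variances = [var0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in xs:
            p = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, var_floor)
    return weights, means, variances


def gmm_pdf(x, model):
    """Mixture density at x for a model returned by fit_gmm_1d."""
    weights, means, variances = model
    return sum(w * gauss(x, m, v) for w, m, v in zip(weights, means, variances))


def train_bayes(samples_by_class, k=2):
    """Per class: (prior, mixture model), with priors taken from class frequencies."""
    total = sum(len(xs) for xs in samples_by_class.values())
    return {c: (len(xs) / total, fit_gmm_1d(xs, k)) for c, xs in samples_by_class.items()}


def classify(x, classifier):
    """Bayes decision rule: the class maximising prior * class-conditional density."""
    return max(classifier, key=lambda c: classifier[c][0] * gmm_pdf(x, classifier[c][1]))


def cross_val_accuracy(data, folds=5, k=2):
    """Estimate the probability of correct classification by k-fold cross-validation."""
    correct = 0
    for f in range(folds):
        test = data[f::folds]
        train = [d for i, d in enumerate(data) if i % folds != f]
        by_class = {}
        for x, label in train:
            by_class.setdefault(label, []).append(x)
        clf = train_bayes(by_class, k)
        correct += sum(classify(x, clf) == label for x, label in test)
    return correct / len(data)


def make_demo_data(seed=0, per_class=40):
    """Synthetic, well-separated feature values for three hypothetical classes."""
    rng = random.Random(seed)
    data = []
    for _ in range(per_class):
        data.append((rng.gauss(-10.0, 1.0), "A"))
        data.append((rng.gauss(rng.choice([0.0, 3.0]), 1.0), "B"))  # bimodal class
        data.append((rng.gauss(12.0, 1.0), "C"))
    return data


print("estimated probability of correct classification:",
      cross_val_accuracy(make_demo_data(), folds=5, k=2))
```

Repeating this estimate for each candidate feature would rank features by their individual classification capability; the subset search with Sequential Floating Forward Selection mentioned in the abstract is not covered by this sketch.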
Pages: 2871 - 2874
Page count: 4
Related papers
50 in total
  • [1] Emotional speech classification using Gaussian mixture models and the sequential floating forward selection algorithm
    Ververidis, D
    Kotropoulos, C
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1 AND 2, 2005, : 1501 - 1504
  • [2] Real Life Emotion Classification from Speech Using Gaussian Mixture Models
    Koolagudi, Shashidhar G.
    Barthwal, Anurag
    Devliyal, Swati
    Rao, K. Sreenivasa
    [J]. CONTEMPORARY COMPUTING, 2012, 306 : 250 - +
  • [3] Classification of stressed speech using Gaussian mixture model
    Patro, H
    Raja, GS
    Dandapat, S
    [J]. INDICON 2005 Proceedings, 2005, : 342 - 346
  • [4] Speech Enhancement Using Gaussian Scale Mixture Models
    Hao, Jiucang
    Lee, Te-Won
    Sejnowski, Terrence J.
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (06) : 1127 - 1136
  • [5] Waveform quantization of speech using Gaussian mixture models
    Samuelsson, J
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 165 - 168
  • [6] Classification and compression of ICEGS using gaussian mixture models
    Coggins, R
    Jabri, M
    [J]. NEURAL NETWORKS FOR SIGNAL PROCESSING VII, 1997, : 226 - 235
  • [7] Using Wavelets and Gaussian Mixture Models for Audio Classification
    Chuan, Ching-Hua
    Vasana, Susan
    Asaithambi, Asai
    [J]. 2012 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM), 2012, : 421 - 426
  • [8] Classification of facial images using Gaussian mixture models
    Liao, P
    Gao, W
    Shen, L
    Chen, XL
    Shan, SG
    Zeng, WB
    [J]. ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2001, PROCEEDINGS, 2001, 2195 : 724 - 731
  • [9] Distribution based classification using Gaussian Mixture Models
    Gudnason, J
    Brookes, M
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 4159 - 4159
  • [10] Age Approximation from Speech using Gaussian Mixture Models
    Mittal, Tanushri
    Barthwal, Anurag
    Koolagudi, Shashidhar G.
    [J]. 2013 SECOND INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING, NETWORKING AND SECURITY (ADCONS 2013), 2013, : 74 - 78