i-vector Algorithm with Gaussian Mixture Model for Efficient Speech Emotion Recognition

Cited by: 9
Authors
Gomes, Joan [1]
El-Sharkawy, Mohamed [1]
Affiliation
[1] Indiana Univ Purdue Univ, Dept Elect & Comp Engn, Indianapolis, IN 46202 USA
Keywords
Speech Emotion Recognition (SER); Gaussian Mixture Model (GMM); GMM Universal Background Model (UBM); Maximum A Posteriori (MAP) Adaptation; i-vector Algorithm; Formant Frequency
DOI
10.1109/CSCI.2015.17
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Emotions are an essential part of human existence because they strongly influence both physical and mental health. They often act as a sensitive catalyst that fosters lively interaction between people. Over the past few decades, researchers have paid increasing attention to the emotional content of speech signals, and many systems have been proposed to make Speech Emotion Recognition (SER) more accurate. The objective of our research is to classify speech emotion using a comparatively new method, the i-vector model. The i-vector model has been very successful in speaker identification, speech recognition and language identification, but it has been little explored for emotion recognition. This paper discusses the design of a speech emotion recognition system with respect to three aspects. First, the i-vector model was used to process the extracted features for speech representation. Second, a classification scheme was designed using a Gaussian Mixture Model (GMM), Maximum A Posteriori (MAP) adaptation and the i-vector algorithm. Finally, the performance of the new system was evaluated on an emotional speech database. Speech emotions were identified with both this system and a conventional system, and the comparison showed that the proposed system identifies speech emotions with lower error and higher accuracy.
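The abstract names GMM-UBM modelling with MAP adaptation as one building block of the classification scheme but gives no formulas. As a rough illustration only, the following sketch shows the commonly used mean-only MAP adaptation of a diagonal-covariance UBM to the frames of one emotion class. It is not the authors' implementation: the function names, the relevance factor of 16, the restriction to mean adaptation and the toy two-component UBM are all assumptions made for this example, and the i-vector extraction step is not covered.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Log density of every frame under every diagonal-covariance component.

    X: (T, D) frames; means, variances: (K, D). Returns a (T, K) array.
    """
    diff = X[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    mahal = np.sum(diff ** 2 / variances[None, :, :], axis=2)    # (T, K)
    return -0.5 * (log_norm[None, :] + mahal)

def posteriors(X, weights, means, variances):
    """Component responsibilities gamma[t, k] for each frame."""
    log_p = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of UBM means to the data X.

    relevance is an assumed hyperparameter; 16 is a common default,
    not a value taken from the paper.
    """
    gamma = posteriors(X, weights, means, variances)   # (T, K)
    n = gamma.sum(axis=0)                              # soft counts, (K,)
    data_mean = (gamma.T @ X) / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]             # adaptation weight
    # Interpolate between the data mean and the prior (UBM) mean.
    return alpha * data_mean + (1.0 - alpha) * means

def avg_log_likelihood(X, weights, means, variances):
    """Average per-frame log-likelihood of X under the GMM."""
    log_p = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    m = log_p.max(axis=1, keepdims=True)
    return float(np.mean(m[:, 0] + np.log(np.exp(log_p - m).sum(axis=1))))

# Toy usage with a hypothetical 2-component UBM and synthetic "emotion" frames.
rng = np.random.default_rng(0)
ubm_w = np.array([0.5, 0.5])
ubm_mu = np.array([[-1.0, 0.0], [1.0, 0.0]])
ubm_var = np.ones((2, 2))
frames = rng.normal(loc=[1.5, 0.5], scale=1.0, size=(200, 2))
emotion_mu = map_adapt_means(frames, ubm_w, ubm_mu, ubm_var)
score = avg_log_likelihood(frames, ubm_w, emotion_mu, ubm_var)
```

In a conventional GMM-UBM classifier, one adapted model would be built per emotion class and a test utterance assigned to the class whose adapted model gives the highest average log-likelihood. Components that see little data keep means close to the UBM, which is the main reason for adapting rather than training each class model from scratch.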
Pages: 476-480
Number of pages: 5