An Innovative Method for Speech Signal Emotion Recognition Based on Spectral Features Using GMM and HMM Techniques

Cited by: 0
Authors
Mohammed Jawad Al-Dujaili Al-Khazraji
Abbas Ebrahimi-Moghadam
Affiliations
[1] University of Kufa, Department of Electronic and Communication, Faculty of Engineering
[2] Ferdowsi University of Mashhad, Electrical Engineering Department, Faculty of Engineering
Keywords
Emotion recognition; Speech; MFCC; LPC; PLP; PCA; GMM; HMM
DOI
Not available
Abstract
Speech is one of the primary means of human communication, and an important function of speech is to convey the speaker's inner feelings to the listener. An utterance carries the speaker's emotional state along with its content, and this emotional content shapes the thoughts and behavior it evokes. Speech Emotion Recognition (SER) is therefore an important problem in human–machine interaction, and the expanding role of computers in everyday life has made this form of human–machine cooperation the subject of extensive research. In this article, SER is examined for English and Persian speech. Time–frequency features, namely Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding (LPC), and Perceptual Linear Prediction (PLP), are extracted from the data as feature vectors; these features are then combined, and a suitable subset is selected from them. Principal Component Analysis (PCA) is used to reduce dimensionality and eliminate redundancy while retaining most of the intrinsic information content of the pattern. Each emotional state is then classified using Gaussian Mixture Model (GMM) and Hidden Markov Model (HMM) techniques. On the English database, combining MFCC + PLP features with PCA and HMM classification achieves an average recognition rate of 88.85% with a runtime of 0.3 s; on the Persian database, PLP features with PCA and HMM classification achieve an average recognition rate of 90.21% with a runtime of 0.4 s. Across the feature and classifier combinations tested, the experimental results show that the proposed approach attains stable, high detection performance for every emotional state.
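As a concrete illustration of the pipeline described above (spectral feature extraction, PCA dimensionality reduction, and generative-model classification), the following is a minimal sketch in Python. It is not the authors' implementation: it uses only MFCC features via librosa and one scikit-learn GaussianMixture per emotion class, the LPC/PLP features and the HMM classifier are omitted for brevity, and the `train_files` structure, function names, and parameter values are illustrative assumptions.

```python
# Minimal sketch of an MFCC -> PCA -> per-class GMM emotion classifier,
# assuming `train_files` maps each emotion label to a list of WAV paths.
# The MFCC-only front end and all parameter values are assumptions, not
# the configuration reported in the paper.
import numpy as np
import librosa
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def extract_features(path, n_mfcc=13):
    """Return a (frames x n_mfcc) matrix of MFCCs for one utterance."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.T  # one row per analysis frame

def train(train_files, n_components=8, pca_dims=10):
    """Fit a shared PCA and one GMM per emotion on frame-level features."""
    all_frames = np.vstack([extract_features(p)
                            for paths in train_files.values()
                            for p in paths])
    pca = PCA(n_components=pca_dims).fit(all_frames)  # dimensionality reduction
    models = {}
    for label, paths in train_files.items():
        frames = np.vstack([extract_features(p) for p in paths])
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type='diag', max_iter=200)
        models[label] = gmm.fit(pca.transform(frames))
    return pca, models

def classify(path, pca, models):
    """Label an utterance by the GMM with the highest average log-likelihood."""
    frames = pca.transform(extract_features(path))
    return max(models, key=lambda label: models[label].score(frames))
```

To move closer to the paper's best-performing configuration, the per-class GMMs could be replaced by sequence models (for example, one hmmlearn GaussianHMM per emotion, scored on each utterance's frame sequence) and the front end extended with LPC/PLP features before fusion and PCA.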
Pages: 735-753
Number of pages: 18