Speech emotion recognition using semi-NMF feature optimization

Cited by: 7
Authors
Bandela, Surekha Reddy [1]
Kumar, T. Kishore [1]
Affiliation
[1] NIT Warangal, Dept Elect & Commun Engn, Hanamkonda, Telangana, India
Keywords
Speech emotion recognition; spectral; Teager energy operator; feature fusion; semi-nonnegative matrix factorization; k-nearest neighborhood; support vector machine; feature selection; classification; frequency
DOI
10.3906/elk-1903-121
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Speech emotion recognition (SER) is an active research area. Many SER systems combine different speech features to improve performance, but training on the resulting large feature set increases classifier complexity. In addition, some features may be irrelevant to emotion detection, which reduces recognition accuracy. To overcome this drawback, feature optimization can be applied to the feature sets to obtain the most relevant emotional features before classification. In this paper, semi-nonnegative matrix factorization (semi-NMF) with singular value decomposition (SVD) initialization is used to optimize the speech features. The features considered are mel-frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), and Teager energy operator-autocorrelation (TEO-AutoCorr). Emotions are classified with k-nearest neighbor (k-NN) and support vector machine (SVM) classifiers under a 5-fold cross-validation scheme. Performance is evaluated on the EMO-DB and IEMOCAP datasets in terms of classification accuracy. The results show that optimizing the feature sets with semi-NMF yields markedly higher accuracy than the baseline SER system without optimization.
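The abstract does not spell out the factorization itself, so the following is only a minimal sketch of semi-NMF under stated assumptions. It factors a feature matrix X (features by samples) as F G^T with G constrained to be nonnegative, using the standard alternating updates for semi-NMF (a least-squares step for F and a multiplicative step for G). The SVD-based initialization shown here (absolute values of the leading right singular vectors plus a small offset) is an assumption, as are the function name `semi_nmf`, the parameter names, the matrix sizes, and the use of the rows of G as the reduced feature set; the paper's actual procedure may differ.

```python
import numpy as np


def _pos(a):
    """Element-wise positive part: (|a| + a) / 2."""
    return (np.abs(a) + a) / 2.0


def _neg(a):
    """Element-wise negative part: (|a| - a) / 2."""
    return (np.abs(a) - a) / 2.0


def semi_nmf(X, k, n_iter=200, eps=1e-9):
    """Approximate X (features x samples) by F @ G.T with G >= 0.

    F (features x k) is unconstrained; G (samples x k) is nonnegative.
    Requires k <= min(X.shape).
    """
    # Assumed SVD initialization: leading right singular vectors,
    # made nonnegative and shifted away from zero.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    G = np.abs(Vt[:k].T) + 0.2

    for _ in range(n_iter):
        # F update: least-squares solution for fixed G.
        F = X @ G @ np.linalg.pinv(G.T @ G)
        # G update: multiplicative rule that keeps G nonnegative.
        XtF = X.T @ F
        FtF = F.T @ F
        num = _pos(XtF) + G @ _neg(FtF)
        den = _neg(XtF) + G @ _pos(FtF)
        G *= np.sqrt((num + eps) / (den + eps))

    # Final F consistent with the last G.
    F = X @ G @ np.linalg.pinv(G.T @ G)
    return F, G


# Hypothetical usage: 39 features per utterance, 120 utterances (random data).
rng = np.random.default_rng(0)
X = rng.standard_normal((39, 120))
F, G = semi_nmf(X, k=10)
rel_err = np.linalg.norm(X - F @ G.T) / np.linalg.norm(X)
print(F.shape, G.shape, float(rel_err))
```

In this sketch each row of G is a k-dimensional nonnegative representation of one utterance, which could then be passed to a k-NN or SVM classifier under 5-fold cross-validation. Because F is unconstrained, the input features (such as cepstral coefficients) may take negative values, which is the usual reason for choosing semi-NMF over standard NMF.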
Pages: 3741-3757
Page count: 17