Improving Speech Emotion Recognition System Using Spectral and Prosodic Features

Cited by: 1
Authors
Chakhtouna, Adil [1]
Sekkate, Sara [2]
Adib, Abdellah [1]
Affiliations
[1] Hassan II Univ Casablanca, Fac Sci & Technol Mohammedia, MCSA Lab, Team Comp Sci Artificial Intelligence & Big Data, Casablanca, Morocco
[2] Higher Natl Sch Arts & Crafts Casablanca, Casablanca, Morocco
Keywords
Speech emotion recognition; Machine learning; Prosodic and spectral features; Feature selection; SVM; KNN
DOI
10.1007/978-3-030-96308-8_37
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The detection of emotions from speech is a key aspect of understanding human behavior, and Speech Emotion Recognition (SER) plays an important role in a diverse range of applications, especially in human-computer communication. The main aim of this study is to build two Machine Learning (ML) models able to classify input speech into several classes of emotions. To this end, we extract a set of prosodic and spectral features from sound files and apply a feature selection method to improve the recognition rate of the proposed system. Experiments are conducted on the RAVDESS database to evaluate the accuracy of the system. We evaluate the performance of our models and compare them with the existing SER literature. Our results indicate that the proposed system based on Support Vector Machine (SVM) and K-Nearest Neighbors (KNN) achieves test accuracies of 69.67% and 65.04%, respectively, on 8 emotional states.
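The pipeline described in the abstract (feature vectors, then feature selection, then a classifier) can be sketched roughly as follows. This is only an illustration under several assumptions, not the authors' implementation: the abstract does not name the feature selection method, so a one-way ANOVA F-score ranking is used as a stand-in; the feature vectors are synthetic numbers rather than prosodic or spectral features extracted from RAVDESS; only a KNN classifier is shown; and the helper names `make_data`, `f_scores`, `select_top_k`, and `knn_predict` are hypothetical.

```python
import math
import random
from collections import Counter


def make_data(rng, n_per_class, labels=("calm", "happy", "angry")):
    """Synthetic stand-in for per-utterance feature vectors (assumption).

    Features 0 and 1 depend on the class; features 2..5 are pure noise.
    """
    X, y = [], []
    for c, label in enumerate(labels):
        for _ in range(n_per_class):
            informative = [rng.gauss(3.0 * c, 1.0), rng.gauss(-3.0 * c, 1.0)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
            X.append(informative + noise)
            y.append(label)
    return X, y


def f_scores(X, y):
    """One-way ANOVA F-statistic per feature (higher = more discriminative)."""
    n, d = len(X), len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(d):
        col = [row[j] for row in X]
        grand_mean = sum(col) / n
        between = within = 0.0
        for c in classes:
            vals = [col[i] for i in range(n) if y[i] == c]
            class_mean = sum(vals) / len(vals)
            between += len(vals) * (class_mean - grand_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in vals)
        ms_between = between / (len(classes) - 1)
        ms_within = within / (n - len(classes))
        scores.append(ms_between / (ms_within + 1e-12))
    return scores


def select_top_k(scores, k):
    """Indices of the k highest-scoring features, in ascending index order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def knn_predict(X_train, y_train, x, k=5):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(
        (math.dist(row, x), label) for row, label in zip(X_train, y_train)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


rng = random.Random(0)
X_train, y_train = make_data(rng, 40)
X_test, y_test = make_data(rng, 20)

# Rank features on the training split only, then keep the best two.
keep = select_top_k(f_scores(X_train, y_train), 2)


def project(row):
    return [row[j] for j in keep]


X_train_sel = [project(r) for r in X_train]
X_test_sel = [project(r) for r in X_test]

preds = [knn_predict(X_train_sel, y_train, x) for x in X_test_sel]
accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
print("selected features:", keep, "test accuracy:", round(accuracy, 3))
```

Scoring the features on the training split alone keeps the test split out of the selection step, which is the usual way to avoid an optimistic accuracy estimate. The accuracy printed here reflects the synthetic data only and says nothing about the figures reported in the abstract.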
Pages: 399-409
Number of pages: 11
Related papers
50 in total
  • [1] A Hybrid Speech Emotion Recognition System Based on Spectral and Prosodic Features
    Zhou, Yu
    Li, Junfeng
    Sun, Yanqing
    Zhang, Jianping
    Yan, Yonghong
    Akagi, Masato
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (10) : 2813 - 2821
  • [2] Emotion Recognition Using Prosodic and Spectral Features of Speech and Naive Bayes Classifier
    Khan, Atreyee
    Roy, Uttam Kumar
    [J]. 2017 2ND IEEE INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, SIGNAL PROCESSING AND NETWORKING (WISPNET), 2017, : 1017 - 1021
  • [4] Emotion recognition from speech using source, system, and prosodic features
    Koolagudi, Shashidhar G.
    Rao, K. Sreenivasa
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2012, 15 (02) : 265 - 289
  • [5] Hierarchical emotion recognition from speech using source, power spectral and prosodic features
    Haque, Arijul
    Rao, K. Sreenivasa
    [J]. Multimedia Tools and Applications, 2024, 83 : 19629 - 19661
  • [6] Hierarchical emotion recognition from speech using source, power spectral and prosodic features
    Haque, Arijul
    Rao, K. Sreenivasa
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (07) : 19629 - 19661
  • [7] PERFORMANCE ANALYSIS OF SPECTRAL AND PROSODIC FEATURES AND THEIR FUSION FOR EMOTION RECOGNITION IN SPEECH
    Gaurav, Manish
    [J]. 2008 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY: SLT 2008, PROCEEDINGS, 2008, : 313 - 316
  • [8] Emotion Recognition from Speech using Prosodic and Linguistic Features
    Pervaiz, Mahwish
    Khan, Tamim Ahmed
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (08) : 84 - 90
  • [9] Emotion recognition from speech using global and local prosodic features
    Rao K.S.
    Koolagudi S.G.
    Vempada R.R.
    [J]. International Journal of Speech Technology, 2013, 16 (2) : 143 - 160
  • [10] Emotion recognition from speech using wavelet packet transform and prosodic features
    Gupta, Manish
    Bharti, Shambhu Shankar
    Agarwal, Suneeta
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2018, 35 (02) : 1541 - 1553