Feedforward Neural Network-Based Architecture for Predicting Emotions from Speech

Cited by: 7
Authors
Gavrilescu, Mihai [1]
Vizireanu, Nicolae [1]
Affiliations
[1] Univ Politehn, Fac Elect Telecommun & Informat Technol, Dept Telecommun, Bucharest 060042, Romania
Keywords
affective computing; speech analysis; emotion recognition; feedforward neural networks; machine learning; RECOGNITION; ENHANCEMENT; FEATURES; MODEL
DOI
10.3390/data4030101
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We propose a novel feedforward neural network (FFNN)-based speech emotion recognition system built on three layers: a base layer, where a set of speech features is evaluated and classified; a middle layer, where a speech matrix is built from the classification scores computed in the base layer; and a top layer, where an FFNN and a rule-based classifier analyze the speech matrix and output the predicted emotion. The system achieves 80.75% accuracy in predicting the six basic emotions and surpasses other state-of-the-art methods when tested on emotion-stimulated utterances. The method is robust and the fastest in the literature, computing a stable prediction in less than 78 s, which makes it attractive for replacing questionnaire-based methods and for real-time use. We also determine a set of correlations between several speech features (intensity contour, speech rate, pause rate, and short-time energy) and the evaluated emotions, extending previous similar studies that did not analyze these speech features. Using these correlations to improve the system leads to a 6% increase in accuracy. The proposed system can be used to improve human-computer interfaces, in computer-mediated education systems, for accident prevention, and for predicting mental disorders and physical diseases.
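The three-layer flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature list, the random base-layer scores, the network size, the untrained weights, and the confidence-threshold fallback rule are all hypothetical stand-ins chosen only to show how a speech matrix might pass from per-feature scores to an FFNN with a rule-based backup.

```python
import numpy as np

# Six basic emotions; the feature names are borrowed from the abstract,
# but using exactly these four as base-layer features is an assumption.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]
FEATURES = ["intensity_contour", "speech_rate", "pause_rate", "short_time_energy"]


def build_speech_matrix(base_scores):
    """Middle layer: stack per-feature emotion scores into a (features x emotions) matrix."""
    return np.array([base_scores[f] for f in FEATURES], dtype=float)


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()


def ffnn_forward(matrix, w1, b1, w2, b2):
    """Top layer, FFNN part: one tanh hidden layer, then softmax over emotions."""
    hidden = np.tanh(matrix.ravel() @ w1 + b1)
    return softmax(hidden @ w2 + b2)


def predict_emotion(matrix, params, min_confidence=0.5):
    """Top layer: trust the FFNN when it is confident, otherwise apply a simple rule.

    The rule (highest mean base-layer score) is a hypothetical placeholder.
    """
    probs = ffnn_forward(matrix, *params)
    best = int(np.argmax(probs))
    if probs[best] >= min_confidence:
        return EMOTIONS[best], probs
    return EMOTIONS[int(np.argmax(matrix.mean(axis=0)))], probs


# Base layer stand-in: random scores instead of real per-feature classifiers.
rng = np.random.default_rng(0)
base_scores = {f: rng.random(len(EMOTIONS)) for f in FEATURES}

# Untrained, randomly initialized weights (illustration only).
n_in, n_hidden = len(FEATURES) * len(EMOTIONS), 8
params = (
    rng.normal(size=(n_in, n_hidden)),
    np.zeros(n_hidden),
    rng.normal(size=(n_hidden, len(EMOTIONS))),
    np.zeros(len(EMOTIONS)),
)

matrix = build_speech_matrix(base_scores)
label, probs = predict_emotion(matrix, params)
print(label, probs.round(3))
```

In a real system the base-layer scores would come from trained classifiers over extracted speech features, and the FFNN weights would be learned from labeled utterances.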
Pages: 23