F0, LPC, and MFCC Analysis for Emotion Recognition Based on Speech

Cited by: 2
Authors
Teixeira, Felipe L. [1, 2, 3]
Teixeira, Joao Paulo [2, 3, 4]
Soares, Salviano F. P. [1, 5]
Pio Abreu, J. L. [6, 7]
Affiliations
[1] Engn Dept UTAD, Sch Sci & Technol, P-5000801 Vila Real, Portugal
[2] Inst Politecn Braganca, Res Ctr Digitalizat & Intelligent Robot CEDRI, P-5300253 Braganca, Portugal
[3] Inst Politecn Braganca, Lab Sustentabilidade & Tecnol Regioes Montanha Su, P-5300253 Braganca, Portugal
[4] Inst Politecn Braganca, Appl Management Res Unit UNIAG, P-5300253 Braganca, Portugal
[5] Inst Elect & Informat Engn Aveiro IEETA, P-3810193 Aveiro, Portugal
[6] Hosp Univ Coimbra, P-3004561 Coimbra, Portugal
[7] Univ Coimbra, Fac Med, P-3000548 Coimbra, Portugal
Keywords
Emotional state; Speech; SVM; Features; Selection; Classification
DOI
10.1007/978-3-031-23236-7_27
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work investigates what is needed to build a database for recognising emotions from speech. It examines features that can support a good success rate for speech-based emotion recognition, as well as characteristics (symptoms) that can be associated with a specific emotional state. A speech emotion recognition (SER) system was built with an SVM, and a binary analysis was compared with a multi-category analysis. The binary analysis achieved an accuracy of 87.5% and the multi-class analysis 42.6%. The features used were the fundamental frequency (F0), Linear Predictive Coefficients (LPC), and Mel Frequency Cepstral Coefficients (MFCC). These modest accuracies were achieved using only the F0, LPC, and MFCC features.
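As an illustration of the first feature named in the abstract, the following is a minimal sketch of F0 estimation for a single voiced frame. The record does not say how the authors extracted F0; the autocorrelation peak-picking used here is a common textbook approach swapped in for illustration, and the function name `estimate_f0`, the parameters `f0_min` and `f0_max`, and the synthetic test tone are all assumptions, not part of the paper.

```python
import math

def estimate_f0(samples, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate F0 (Hz) of one voiced frame from its autocorrelation peak.

    Hypothetical helper: the search range f0_min..f0_max is an assumed
    range for speech, not a value taken from the paper.
    """
    # Remove the DC offset so the autocorrelation reflects periodicity only.
    mean = sum(samples) / len(samples)
    x = [s - mean for s in samples]

    # Search only the lags that correspond to the assumed F0 range.
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(x) - 1)

    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation at this lag.
        score = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    # The lag of the strongest peak is the period in samples.
    return sample_rate / best_lag

# Synthetic 200 Hz tone (40 ms at 16 kHz) standing in for a voiced speech frame.
sample_rate = 16000
frame = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(640)]
f0 = estimate_f0(frame, sample_rate)
```

On real speech, a sketch like this would need framing, a voicing decision, and protection against octave errors. Per-utterance statistics of the resulting F0 track could then be combined with LPC and MFCC values as input to a classifier such as an SVM.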
Pages: 389-404
Number of pages: 16