Prosodic, spectral and voice quality feature selection using a long-term stopping criterion for audio-based emotion recognition

Cited by: 14
Authors
Kaechele, Markus [1]
Zharkov, Dimitrij [1]
Meudt, Sascha [1]
Schwenker, Friedhelm [1]
Affiliations
[1] Univ Ulm, Inst Neural Informat Proc, D-89069 Ulm, Germany
Keywords
CLASSIFIER SYSTEMS;
DOI
10.1109/ICPR.2014.148
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Emotion recognition from speech is an important field of research in human-machine interfaces, and has begun to influence everyday life through its use in areas such as call centers and wearable companions in the form of smartphones. In the proposed classification architecture, different spectral, prosodic and the relatively novel voice quality features are extracted from the speech signals. These features are then used to represent long-term information of the speech, leading to utterance-wise suprasegmental features. The most promising of these features are selected using a forward-selection/backward-elimination algorithm with a novel long-term termination criterion for the selection. The overall system has been evaluated using recordings from the public Berlin emotion database. Using the resulting features, a recognition rate of 88.97% has been achieved, which surpasses the performance of humans on this database and is comparable to the state-of-the-art performance on this dataset.
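The abstract names a forward-selection/backward-elimination search with a long-term termination criterion but does not define that criterion. The following is a minimal Python sketch of the general idea, not the authors' algorithm: it assumes the criterion behaves like a patience counter, so the search continues through steps that do not help and stops only after `patience` consecutive steps without a new best score. The names `select_features`, `score`, `patience` and `toy_score` are hypothetical and introduced only for illustration.

```python
from typing import Callable, FrozenSet, Tuple


def select_features(
    n_features: int,
    score: Callable[[FrozenSet[int]], float],
    patience: int = 3,
) -> Tuple[FrozenSet[int], float]:
    """Greedy forward selection with backward elimination.

    ASSUMPTION: the "long-term" stop is modelled as a patience counter;
    the abstract does not specify the actual criterion.
    """
    current: FrozenSet[int] = frozenset()
    best_subset, best_score = current, float("-inf")
    steps_without_gain = 0

    while len(current) < n_features and steps_without_gain < patience:
        # Forward step: add the single feature that yields the highest score.
        remaining = [f for f in range(n_features) if f not in current]
        added = max(remaining, key=lambda f: score(current | {f}))
        current = current | {added}
        current_score = score(current)

        # Backward step: drop earlier features while that strictly improves
        # the score. The feature just added is never dropped here.
        while True:
            removable = [f for f in current if f != added]
            if not removable:
                break
            reduced = max((current - {f} for f in removable), key=score)
            reduced_score = score(reduced)
            if reduced_score <= current_score:
                break
            current, current_score = reduced, reduced_score

        # Patience-based stop: remember the best subset seen so far and
        # tolerate a limited number of steps without a new best.
        if current_score > best_score:
            best_subset, best_score = current, current_score
            steps_without_gain = 0
        else:
            steps_without_gain += 1

    return best_subset, best_score


def toy_score(subset: FrozenSet[int]) -> float:
    """Toy objective: features 0-2 help, every other feature costs 0.1."""
    useful = frozenset({0, 1, 2})
    return len(subset & useful) - 0.1 * len(subset - useful)


subset, value = select_features(n_features=6, score=toy_score, patience=2)
print(sorted(subset), value)
```

In a real setting, `score` would be something like the cross-validated recognition rate of a classifier trained on the candidate feature subset. A larger `patience` lets the search pass through temporary plateaus at the cost of more evaluations.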
Pages: 803-808
Number of pages: 6
Related papers
50 in total
  • [1] Audio-Based Emotion Recognition Using Self-Supervised Learning on an Engineered Feature Space
    Nimitsurachat, Peranut
    Washington, Peter
    AI, 2024, 5 (01) : 195 - 207
  • [2] Using the Fisher Vector Representation for Audio-based Emotion Recognition
    Gosztolya, Gabor
    ACTA POLYTECHNICA HUNGARICA, 2020, 17 (06) : 7 - 23
  • [3] Human Voice Emotion Identification Using Prosodic and Spectral Feature Extraction Based on Deep Neural Networks
    Gumelar, Agustinus Bimo
    Kurniawan, Afid
    Sooai, Adri Gabriel
    Purnomo, Mauridhi Hery
    Yuniarno, Eko Mulyanto
    Sugiarto, Indar
    Widodo, Agung
    Kristanto, Andreas Agung
    Fahrudin, Tresna Maulana
    2019 IEEE 7TH INTERNATIONAL CONFERENCE ON SERIOUS GAMES AND APPLICATIONS FOR HEALTH (SEGAH), 2019,
  • [4] Audio-based Emotion Recognition using GMM Supervector an SVM Linear Kernel
    Dinh-Son Tran
    Yang, Hyung-Jeong
    Kim, Soo-Hyung
    Lee, Guee Sang
    Luu-Ngoc Do
    Ngoc-Huynh Ho
    Van Quan Nguyen
    2ND INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND SOFT COMPUTING (ICMLSC 2018), 2015, : 169 - 173
  • [5] Deep Emotion Recognition using Prosodic and Spectral Feature Extraction and Classification based on Cross Validation and Bootstrap
    Sharma, Ayush
    Anderson, David V.
    2015 IEEE SIGNAL PROCESSING AND SIGNAL PROCESSING EDUCATION WORKSHOP (SP/SPE), 2015, : 421 - 425
  • [6] Comparison of Feature Selection Methods in Voice Based Emotion Recognition Systems
    Atalay, Tolga
    Ayata, Deger
    Yaslan, Yusuf
    2018 26TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2018,
  • [7] Using Prosodic Phrase-Based VQVAE on Audio ALBERT for Speech Emotion Recognition
    Hsu, Jia-Hao
    Wu, Chung-Hsien
    Yang, Tsung-Hsien
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 415 - 419
  • [8] Pashto Spoken Digits Recognition Using Spectral and Prosodic Based Feature Extraction
    Nisar, Shibli
    Shahzad, Ibrahim
    Khan, Muhammad Adnan
    Tariq, Muhammad
    2017 NINTH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTATIONAL INTELLIGENCE (ICACI), 2017, : 74 - 78
  • [9] Attention and Feature Selection for Automatic Speech Emotion Recognition Using Utterance and Syllable-Level Prosodic Features
    Starlet Ben Alex
    Leena Mary
    Ben P. Babu
    Circuits, Systems, and Signal Processing, 2020, 39 : 5681 - 5709
  • [10] ReGA Based Feature Selection Emotion Recognition Using EEG Signals
    Kong, Yonghui
    Yan, Jianzhou
    Xu, Hongxia
    2017 CHINESE AUTOMATION CONGRESS (CAC), 2017, : 6588 - 6593