Effective ensembling classification strategy for voice and emotion recognition

Cited by: 0
Author
Yasser Alharbi
Affiliation
[1] University of Hail, College of Computer Science and Engineering
Keywords
VER; Recognition; CapsNet; R-LSTM; Ensemble learning; Accuracy; Emotion
DOI
Not available
Abstract
Machine learning techniques are among the most effective approaches to Voice and Emotion Recognition (VER). Automatic recognition of voice and emotions is essential for smooth psychosocial interaction between humans and machines. VER research has made great strides in building workable systems that combine spectrogram and deep learning features. However, although single Machine Learning (ML) methods deliver acceptable results, they do not yet reach the required standard. This calls for strategies that use several ML techniques and target multiple aspects and elements of voice recognition. This article proposes an ensemble classifier model that combines the outputs of two base classifiers (CapsNet and RNN) for VER. The CapsNet model can identify the spatial correlation of vital speech information in spectrograms using a pooling technique. The RNN, in turn, is well suited to processing time-series data, and both are well known for their classification performance. Stacked generalization is used to construct the ensemble classifier, which integrates the predictions made by the CapsNet and RNN classifiers. The ensemble approach achieves an overall accuracy of 96.05%, which is higher than that of either CapsNet or RNN individually. A significant benefit of the proposed classifier is that it effectively detects the emotional class 'FEAR', with a recognition rate of 96.68% among seven other classes.
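The abstract names stacked generalization as the way the two base classifiers are combined. The sketch below shows one way that idea can look, and is not the author's implementation. It assumes each base model emits a class-probability vector, and it swaps in a small softmax-regression meta-learner because the abstract does not say which meta-learner is used. The label subset, the function names, and the held-out probabilities are all hypothetical placeholders, not data or results from the paper.

```python
import math

# Illustrative label subset (assumption); the paper's full label set is not listed here.
LABELS = ["anger", "fear", "joy"]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def meta_features(p_capsnet, p_rnn):
    """Concatenate the two base models' class-probability vectors."""
    return list(p_capsnet) + list(p_rnn)

def scores(model, x):
    """Linear class scores of the meta-learner for one meta-feature vector."""
    weights, biases = model
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def train_meta_learner(X, y, n_classes, lr=0.5, epochs=200):
    """Fit softmax regression on meta-features with batch gradient descent."""
    dim, n = len(X[0]), len(X)
    weights = [[0.0] * dim for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        grad_w = [[0.0] * dim for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, target in zip(X, y):
            probs = softmax(scores((weights, biases), x))
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == target else 0.0)
                grad_b[k] += err
                for j in range(dim):
                    grad_w[k][j] += err * x[j]
        for k in range(n_classes):
            biases[k] -= lr * grad_b[k] / n
            for j in range(dim):
                weights[k][j] -= lr * grad_w[k][j] / n
    return weights, biases

def predict(model, x):
    """Index of the most probable class under the meta-learner."""
    probs = softmax(scores(model, x))
    return max(range(len(probs)), key=probs.__getitem__)

# Placeholder held-out predictions (made-up numbers, NOT results from the paper).
# In stacking, these should come from base models that never saw the samples
# during their own training, for example out-of-fold predictions.
held_out = [
    # (CapsNet probabilities, RNN probabilities, true label index)
    ([0.8, 0.1, 0.1], [0.6, 0.2, 0.2], 0),
    ([0.1, 0.8, 0.1], [0.2, 0.6, 0.2], 1),
    ([0.1, 0.1, 0.8], [0.2, 0.2, 0.6], 2),
]
X = [meta_features(caps, rnn) for caps, rnn, _ in held_out]
y = [label for _, _, label in held_out]
model = train_meta_learner(X, y, n_classes=len(LABELS))
print([LABELS[predict(model, x)] for x in X])
```

Feeding the meta-learner probability vectors rather than hard labels lets it weigh how confident each base model is per class, which is the usual motivation for stacking over simple majority voting.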
Pages: 334-345
Number of pages: 11