Semi-Natural and Spontaneous Speech Recognition Using Deep Neural Networks with Hybrid Features Unification

被引:8
|
作者
Amjad, Ammar [1 ]
Khan, Lal [1 ]
Chang, Hsien-Tsung [1 ,2 ,3 ,4 ]
机构
[1] Chang Gung Univ, Dept Comp Sci & Informat Engn, Taoyuan 33302, Taiwan
[2] Chang Gung Mem Hosp, Dept Phys Med & Rehabil, Taoyuan 33302, Taiwan
[3] Chang Gung Univ, Artificial Intelligence Res Ctr, Taoyuan 33302, Taiwan
[4] Chang Gung Univ, Bachelor Program Artificial Intelligence, Taoyuan 33302, Taiwan
关键词
spontaneous database; semi-natural database; speech emotion recognition; multiple feature fusion; support vector machine; EMOTION RECOGNITION; INFORMATION; SPACE;
D O I
10.3390/pr9122286
中图分类号
TQ [化学工业];
学科分类号
0817 ;
摘要
Recently, identifying speech emotions in a spontaneous database has been a complex and demanding study area. This research presents an entirely new approach for recognizing semi-natural and spontaneous speech emotions with multiple feature fusion and deep neural networks (DNN). A proposed framework extracts the most discriminative features from hybrid acoustic feature sets. However, these feature sets may contain duplicate and irrelevant information, leading to inadequate emotional identification. Therefore, an support vector machine (SVM) algorithm is utilized to identify the most discriminative audio feature map after obtaining the relevant features learned by the fusion approach. We investigated our approach utilizing the eNTERFACE05 and BAUM-1s benchmark databases and observed a significant identification accuracy of 76% for a speaker-independent experiment with SVM and 59% accuracy with, respectively. Furthermore, experiments on the eNTERFACE05 and BAUM-1s dataset indicate that the suggested framework outperformed current state-of-the-art techniques on the semi-natural and spontaneous datasets.
引用
收藏
页数:16
相关论文
共 50 条
  • [1] Recognizing Semi-Natural and Spontaneous Speech Emotions Using Deep Neural Networks
    Amjad, Ammar
    Khan, Lal
    Ashraf, Noman
    Mahmood, Muhammad Bilal
    Chang, Hsien-Tsung
    IEEE ACCESS, 2022, 10 : 37149 - 37163
  • [2] Emotion Recognition from Semi Natural Speech Using Artificial Neural Networks and Excitation Source Features
    Koolagudi, Shashidhar G.
    Devliyal, Swati
    Barthwal, Anurag
    Rao, K. Sreenivasa
    CONTEMPORARY COMPUTING, 2012, 306 : 273 - +
  • [3] Modulation spectral features for speech emotion recognition using deep neural networks
    Singh, Premjeet
    Sahidullah, Md
    Saha, Goutam
    SPEECH COMMUNICATION, 2023, 146 : 53 - 69
  • [4] Semi-Supervised Learning for Spanish Speech Recognition Using Deep Neural Networks
    Rosario Campomanes-Alvarez, Blanca
    Quiros, Pelayo
    Fernandez, Bernardo
    APPLICATIONS OF INTELLIGENT SYSTEMS, 2018, 310 : 19 - 29
  • [5] Emotion recognition from speech using deep recurrent neural networks with acoustic features
    Byun, Sung-Woo
    Shin, Bo-Ra
    Lee, Seok-Pil
    Han, Hyuk-Soo
    BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2018, 123 : 43 - 44
  • [6] Emotional Speech Recognition Using Deep Neural Networks
    Trinh Van, Loan
    Dao Thi Le, Thuy
    Le Xuan, Thanh
    Castelli, Eric
    SENSORS, 2022, 22 (04)
  • [7] ARTICULATORY FEATURES FROM DEEP NEURAL NETWORKS AND THEIR ROLE IN SPEECH RECOGNITION
    Mitra, Vikramjit
    Sivaraman, Ganesh
    Nam, Hosung
    Espy-Wilson, Carol
    Saltzman, Elliot
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [8] Automatic Recognition of Kazakh Speech Using Deep Neural Networks
    Mamyrbayev, Orken
    Turdalyuly, Mussa
    Mekebayev, Nurbapa
    Alimhan, Keylan
    Kydyrbekova, Aizat
    Turdalykyzy, Tolganay
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2019, PT II, 2019, 11432 : 465 - 474
  • [9] Speech Recognition Using Deep Neural Networks: A Systematic Review
    Nassif, Ali Bou
    Shahin, Ismail
    Attili, Imtinan
    Azzeh, Mohammad
    Shaalan, Khaled
    IEEE ACCESS, 2019, 7 : 19143 - 19165
  • [10] Speech Emotion Recognition using MFCC and Hybrid Neural Networks
    Badr, Youakim
    Mukherjee, Partha
    Thumati, Sindhu
    PROCEEDINGS OF THE 13TH INTERNATIONAL JOINT CONFERENCE ON COMPUTATIONAL INTELLIGENCE (IJCCI), 2021, : 366 - 373