Transfer learning based convolution neural net for authentication and classification of emotions from natural and stimulated speech signals

Cited by: 4
Authors
Kumar, Mukul [1]
Katyal, Nipun [1]
Ruban, Nersisson [1]
Lyakso, Elena [2]
Mekala, A. Mary [3]
Raj, Alex Noel Joseph [4]
Richard, G. Maarc [1]
Affiliations
[1] Vellore Inst Technol Vellore, Sch Elect Engn, Vellore, Tamil Nadu, India
[2] St Petersburg State Univ, St Petersburg, Russia
[3] Vellore Inst Technol Vellore, Sch Informat Technol & Engn, Vellore, Tamil Nadu, India
[4] Shantou Univ, Coll Engn, Dept Elect Engn, Key Lab Digital Signal & Image Proc Guangdong Pro, Shantou, Peoples R China
Keywords
Deep learning; speech fidelity classification; linear prediction cepstral coefficients (LPCC); mel frequency cepstral coefficients (MFCC); speech emotion recognition; RECOGNITION; FREQUENCY;
DOI
10.3233/JIFS-210711
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Over the years, differentiating emotions in oral communication has come to play an important role in emotion-based studies, and various algorithms have been proposed to classify the kinds of emotion. However, there is no measure of the fidelity of the emotion under consideration, primarily because most readily available annotated datasets are produced by actors rather than recorded in real-world scenarios. The predicted emotion therefore lacks an important aspect called authenticity, that is, whether an emotion is actual or stimulated. In this research work, we have developed a hybrid convolutional neural network algorithm, based on transfer learning and style transfer, to classify both the emotion and the fidelity of the emotion. The model is trained on features extracted from a dataset that contains stimulated as well as actual utterances. We have compared the developed algorithm with conventional machine learning and deep learning techniques using metrics such as accuracy, precision, recall and F1 score. The developed model performs much better than the conventional machine learning and deep learning models. The research aims to dive deeper into human emotion and build a model that understands it as humans do, with precision, recall and F1 score values of 0.994, 0.996 and 0.995 for speech authenticity and 0.992, 0.989 and 0.99 for speech emotion classification, respectively.
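As a reading aid, the short sketch below recomputes the F1 score from the precision and recall values quoted in the abstract, using the standard definition of F1 as the harmonic mean of precision and recall. The function name `f1_score` is a hypothetical helper introduced here for illustration and is not taken from the paper; the sketch only checks that the quoted figures are mutually consistent and says nothing about how the authors computed or averaged their metrics.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both values are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall values as quoted in the abstract.
authenticity_f1 = f1_score(0.994, 0.996)  # speech authenticity
emotion_f1 = f1_score(0.992, 0.989)       # speech emotion classification

print(f"authenticity F1: {authenticity_f1:.3f}")
print(f"emotion F1:      {emotion_f1:.3f}")
```

To three decimal places both results agree with the F1 values reported in the abstract (0.995 and 0.99).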
Pages: 2013 - 2024
Page count: 12
Related papers
50 in total
  • [21] Classification of Human Emotions from EEG Signals using Statistical Features and Neural Network
    Yuen, Chai Tong
    San, Woo San
    Rizon, Mohamed
    Seong, Tan Ching
    INTERNATIONAL JOURNAL OF INTEGRATED ENGINEERING, 2009, 1 (03): : 71 - 79
  • [22] Classification of Post-COVID-19 Emotions with Residual-Based Separable Convolution Networks and EEG Signals
    Abbas, Qaisar
    Baig, Abdul Rauf
    Hussain, Ayyaz
    SUSTAINABILITY, 2023, 15 (02)
  • [23] Cardiovascular disease classification using Convolution Neural Network based on deep learning
    Shon, H. S.
    Kim, K. O.
    Cha, E.
    Kim, K.
    FEBS OPEN BIO, 2019, 9 : 106 - 106
  • [24] A Neural Network Based Approach for Recognition of Basic Emotions from Speech
    Sham-E-Ansari, Md
    Disha, Shaminaj Towfika
    Chowdhury, Atiqul Islam
    Hasan, Md Khairul
    2020 IEEE REGION 10 SYMPOSIUM (TENSYMP) - TECHNOLOGY FOR IMPACTFUL SUSTAINABLE DEVELOPMENT, 2020, : 807 - 810
  • [25] A deep transfer learning based convolution neural network framework for air temperature classification using human clothing images
    Ahmed, Maqsood
    Zhang, Xiang
    Shen, Yonglin
    Ali, Nafees
    Flah, Aymen
    Kanan, Mohammad
    Alsharef, Mohammad
    Ghoneim, Sherif S. M.
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [26] ITL-CNN: Integrated Transfer Learning-Based Convolution Neural Network for Ultrasound PCOS Image Classification
    Gopalakrishnan, C.
    Iyapparaja, M.
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2022, 36 (16)
  • [27] An ensemble transfer learning-based deep convolution neural network for the detection and classification of diseased cotton leaves and plants
    Rai, C. K.
    Pahuja, R.
    Multimedia Tools and Applications, 2024, 83 (36) : 83991 - 84024
  • [28] Neural Entrainment to Natural Speech Envelope Based on Subject Aligned EEG Signals
    Zhou, Di
    Zhang, Gaoyan
    Dang, Jianwu
    Wu, Shuang
    Zhang, Zhuo
    INTERSPEECH 2020, 2020, : 106 - 110
  • [29] Lung cancer histopathology image classification using transfer learning with convolution neural network model
    Muniasamy, Anandhavalli
    Alquhtani, Salma Abdulaziz Saeed
    Bilfaqih, Syeda Meraj
    Balaji, Prasanalakshmi
    Karunakaran, Gauthaman
    TECHNOLOGY AND HEALTH CARE, 2024, 32 (02) : 1199 - 1210
  • [30] FCNet: Flower Classification Using Custom-Made Convolution Neural Network and Transfer Learning
    Vardiyani, Roma
    Sahu, Satya Prakash
    PROCEEDINGS OF EMERGING TRENDS AND TECHNOLOGIES ON INTELLIGENT SYSTEMS (ETTIS 2021), 2022, 1371 : 115 - 125