Transfer learning based convolution neural net for authentication and classification of emotions from natural and stimulated speech signals

Cited by: 4
Authors
Kumar, Mukul [1]
Katyal, Nipun [1]
Ruban, Nersisson [1]
Lyakso, Elena [2]
Mekala, A. Mary [3]
Raj, Alex Noel Joseph [4]
Richard, G. Maarc [1]
Affiliations
[1] Vellore Inst Technol Vellore, Sch Elect Engn, Vellore, Tamil Nadu, India
[2] St Petersburg State Univ, St Petersburg, Russia
[3] Vellore Inst Technol Vellore, Sch Informat Technol & Engn, Vellore, Tamil Nadu, India
[4] Shantou Univ, Coll Engn, Dept Elect Engn, Key Lab Digital Signal & Image Proc Guangdong Pro, Shantou, Peoples R China
Keywords
Deep learning; speech fidelity classification; linear prediction cepstral coefficients (LPCC); mel frequency cepstral coefficients (MFCC); speech emotion recognition; RECOGNITION; FREQUENCY;
DOI
10.3233/JIFS-210711
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Differentiating emotions in spoken communication plays an important role in emotion-based studies, and a range of algorithms has been proposed to classify emotion types. However, there is no measure of the fidelity of the emotion under consideration, primarily because most readily available annotated datasets are produced by actors rather than recorded in real-world scenarios. The predicted emotion therefore lacks an important aspect, authenticity: whether the emotion is actual or stimulated. In this work, we develop a hybrid convolutional neural network algorithm, based on transfer learning and style transfer, that classifies both the emotion and its fidelity. The model is trained on features extracted from a dataset containing stimulated as well as actual utterances. We compare the developed algorithm with conventional machine learning and deep learning techniques using accuracy, precision, recall, and F1 score, and it performs considerably better than those models. It achieves precision, recall, and F1 score values of 0.994, 0.996, and 0.995 for speech authenticity and 0.992, 0.989, and 0.99 for speech emotion classification, respectively. The research aims to dive deeper into human emotion and to build a model that understands it as humans do.
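As a reading aid, the short sketch below checks how the three reported metrics relate to one another. It assumes the standard definition of the F1 score as the harmonic mean of precision and recall; the function name is hypothetical, and the abstract does not say how the authors averaged the scores across classes, so this is a consistency check rather than a reproduction of their evaluation.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall values quoted in the abstract.
authenticity_f1 = f1_from_precision_recall(0.994, 0.996)
emotion_f1 = f1_from_precision_recall(0.992, 0.989)

print(f"speech authenticity F1: {authenticity_f1:.3f}")
print(f"speech emotion F1:      {emotion_f1:.3f}")
```

Rounded to three decimals, both results agree with the F1 values quoted in the abstract (0.995 and 0.99). If the reported scores were macro-averaged over classes, this agreement would not be guaranteed in general.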
Pages: 2013-2024
Number of pages: 12