Transfer learning based convolution neural net for authentication and classification of emotions from natural and stimulated speech signals

Cited by: 4
Authors
Kumar, Mukul [1]
Katyal, Nipun [1]
Ruban, Nersisson [1]
Lyakso, Elena [2]
Mekala, A. Mary [3]
Raj, Alex Noel Joseph [4]
Richard, G. Maarc [1]
Affiliations
[1] Vellore Inst Technol Vellore, Sch Elect Engn, Vellore, Tamil Nadu, India
[2] St Petersburg State Univ, St Petersburg, Russia
[3] Vellore Inst Technol Vellore, Sch Informat Technol & Engn, Vellore, Tamil Nadu, India
[4] Shantou Univ, Coll Engn, Dept Elect Engn, Key Lab Digital Signal & Image Proc Guangdong Pro, Shantou, Peoples R China
Keywords
Deep learning; speech fidelity classification; linear prediction cepstral coefficients (LPCC); mel frequency cepstral coefficients (MFCC); speech emotion recognition; RECOGNITION; FREQUENCY;
DOI
10.3233/JIFS-210711
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Distinguishing emotions in spoken communication has long played an important role in emotion-based studies, and a variety of algorithms have been proposed to classify emotions. However, there is no measure of the fidelity of the emotion under consideration, primarily because most readily available annotated datasets are produced by actors rather than recorded in real-world scenarios. The predicted emotion therefore lacks an important aspect called authenticity, that is, whether an emotion is actual or stimulated. In this research work, we have developed a hybrid convolutional neural network algorithm based on transfer learning and style transfer to classify both the emotion and its fidelity. The model is trained on features extracted from a dataset that contains stimulated as well as actual utterances. We compared the developed algorithm with conventional machine learning and deep learning techniques using accuracy, precision, recall and F1 score. The developed model performs much better than the conventional machine learning and deep learning models. The research aims to dive deeper into human emotion and build a model that understands it as humans do; the model achieves precision, recall and F1 score values of 0.994, 0.996 and 0.995 for speech authenticity and 0.992, 0.989 and 0.99 for speech emotion classification, respectively.
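The reported F1 scores can be checked for consistency with the reported precision and recall. The sketch below assumes that each F1 value is the harmonic mean of the corresponding precision and recall; the abstract does not say how the metrics were averaged across classes, so this is a plausibility check and not the authors' evaluation procedure. The function name `f1_score` and the dictionary `reported` are illustrative names, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as quoted in the abstract
reported = {
    "speech authenticity": (0.994, 0.996, 0.995),
    "speech emotion": (0.992, 0.989, 0.99),
}

for task, (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    print(f"{task}: computed F1 = {f1:.3f}, reported F1 = {f1_reported}")
```

Under this assumption, both computed values agree with the reported F1 scores to three decimal places.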
Pages: 2013-2024
Number of pages: 12