A lightweight 2D CNN based approach for speaker-independent emotion recognition from speech with new Indian Emotional Speech Corpora

Cited by: 0

Authors
Youddha Beer Singh
Shivani Goel
Affiliations
[1] Bennett University,School of Computer Science Engineering and Technology
[2] KIET Group of Institutions,Department of Computer Science and Information Technology
[3] Delhi-NCR
Keywords
Convolutional Neural Network; Indian Emotional Speech Corpora; Spectrogram; Speech Emotion Recognition;
DOI: none available
Abstract
Speech Emotion Recognition (SER) is the process of recognizing emotions by extracting a small set of features from speech signals. It is becoming very popular in Human-Computer Interaction (HCI) applications. The challenge is to extract relevant speech features that allow emotions to be recognized at a low computational cost. In this paper, a lightweight Convolutional Neural Network (LCNN) based model is proposed which extracts useful features automatically. The speech samples are converted into spectrograms of size 224 × 224 for LCNN input. Five CNN layers are used, and strided convolutions down-sample the feature maps in place of pooling layers, which reduces the computational cost. The model has been evaluated for accuracy on the publicly available benchmark datasets EMOVO (81%), EMODB (87%), and SAVEE (80%). Its accuracy is also found to be better than that of the SER CNN-assisted model and the ResNet-18 and ResNet-34 models. Very few speech datasets are available in Indian accents, so the authors have created a new Indian Emotional Speech Corpora (IESC) in the English language, with 600 speech samples recorded from 8 speakers using 2 sentences in 5 emotions. It will be made publicly available for researchers. The accuracy of the proposed LCNN model on IESC is found to be 95%, which is better than on the existing datasets.
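The down-sampling scheme the abstract describes (strided convolutions in place of pooling layers, five convolutional layers, 224 × 224 spectrogram input) can be sketched as below. This is a minimal illustration in PyTorch, assuming a plausible channel progression, kernel size, and classifier head; these details are not taken from the paper.

```python
# Hypothetical sketch of a lightweight CNN (LCNN) of the kind described in the
# abstract: 5 conv layers, stride-2 convolutions for down-sampling instead of
# pooling, 224x224 spectrogram input, 5 emotion classes (as in IESC).
# Channel widths, kernel size, and the linear head are assumptions.
import torch
import torch.nn as nn


class LightweightCNN(nn.Module):
    def __init__(self, num_emotions: int = 5):
        super().__init__()
        chans = [3, 16, 32, 64, 128, 256]  # assumed channel progression
        layers = []
        for c_in, c_out in zip(chans[:-1], chans[1:]):
            # stride=2 halves each spatial dimension, so no pooling layer
            # is needed: 224 -> 112 -> 56 -> 28 -> 14 -> 7 over five layers.
            layers += [
                nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
            ]
        self.features = nn.Sequential(*layers)
        self.classifier = nn.Linear(256 * 7 * 7, num_emotions)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)  # (N, 256, 7, 7) for 224x224 input
        return self.classifier(x.flatten(1))


model = LightweightCNN(num_emotions=5)
out = model(torch.randn(2, 3, 224, 224))  # a batch of 2 spectrogram images
print(out.shape)  # prints torch.Size([2, 5])
```

Replacing pooling with strided convolutions removes the separate pooling layers entirely, which is one common way to cut the per-layer operation count in a lightweight model.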
Pages: 23055–23073 (18 pages)
Related Papers (50 total)
  • [1] A lightweight 2D CNN based approach for speaker-independent emotion recognition from speech with new Indian Emotional Speech Corpora
    Singh, Youddha Beer
    Goel, Shivani
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (15) : 23055 - 23073
  • [2] Speaker-Independent Speech Emotion Recognition Based on CNN-BLSTM and Multiple SVMs
    Liu, Zhen-Tao
    Xiao, Peng
    Li, Dan-Yun
    Hao, Man
    [J]. INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2019, PT III, 2019, 11742 : 481 - 491
  • [3] Speaker Adversarial Neural Network (SANN) for Speaker-independent Speech Emotion Recognition
    Fahad, Md Shah
    Ranjan, Ashish
    Deepak, Akshay
    Pradhan, Gayadhar
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2022, 41 (11) : 6113 - 6135
  • [6] Domain Invariant Feature Learning for Speaker-Independent Speech Emotion Recognition
    Lu, Cheng
    Zong, Yuan
    Zheng, Wenming
    Li, Yang
    Tang, Chuangao
    Schuller, Bjoern W.
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2217 - 2230
  • [7] Speaker-Independent Speech Emotion Recognition Based Multiple Kernel Learning of Collaborative Representation
    Zha, Cheng
    Zhang, Xinrang
    Zhao, Li
    Liang, Ruiyu
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2016, E99A (03) : 756 - 759
  • [8] Speaker-independent Speech Emotion Recognition Based on Random Forest Feature Selection Algorithm
    Cao, Wei-Hua
    Xu, Jian-Ping
    Liu, Zhen-Tao
    [J]. PROCEEDINGS OF THE 36TH CHINESE CONTROL CONFERENCE (CCC 2017), 2017, : 10995 - 10998
  • [9] Study on Speaker-Independent Emotion Recognition from Speech on Real-World Data
    Kostoulas, Theodoros
    Ganchev, Todor
    Fakotakis, Nikos
    [J]. VERBAL AND NONVERBAL FEATURES OF HUMAN-HUMAN AND HUMAN-MACHINE INTERACTIONS, 2008, 5042 : 235 - 242
  • [10] Text Independent Speaker and Emotion Independent Speech Recognition in Emotional Environment
    Revathi, A.
    Venkataramani, Y.
    [J]. INFORMATION SYSTEMS DESIGN AND INTELLIGENT APPLICATIONS, VOL 1, 2015, 339 : 43 - 52