A lightweight 2D CNN based approach for speaker-independent emotion recognition from speech with new Indian Emotional Speech Corpora

Cited by: 0

Authors
Youddha Beer Singh
Shivani Goel
Affiliations
[1] Bennett University,School of Computer Science Engineering and Technology
[2] KIET Group of Institutions,Department of Computer Science and Information Technology
[3] Delhi-NCR
Keywords
Convolutional Neural Network; Indian Emotional Speech Corpora; Spectrogram; Speech Emotion Recognition;
DOI: none available
Abstract
Speech Emotion Recognition (SER) is the process of recognizing emotions by extracting a small set of features from speech signals. It is becoming very popular in Human-Computer Interaction (HCI) applications. The challenge is to extract relevant speech features that allow emotions to be recognized at a low computational cost. In this paper, a lightweight Convolutional Neural Network (LCNN) based model is proposed which extracts useful features automatically. The speech samples are converted into spectrograms of size 224 × 224 for LCNN input. Five CNN layers are used, and strided convolutions down-sample the feature maps in place of pooling layers, which reduces the computational cost. The model has been evaluated for accuracy on the publicly available benchmark datasets EMOVO (81%), EMODB (87%), and SAVEE (80%). Its accuracy is also found to be better than that of the SER CNN-assisted model and the ResNet-18 and ResNet-34 models. Very few speech datasets are available in Indian accents, so the authors have created a new Indian Emotional Speech Corpora (IESC) in the English language, with 600 speech samples recorded from 8 speakers using 2 sentences in 5 emotions. It will be made publicly available for researchers. The accuracy of the proposed LCNN model on IESC is found to be 95%, which is better than on the existing datasets.
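The down-sampling scheme the abstract describes (strided convolutions in place of pooling layers, five convolutional layers, 224 × 224 spectrogram input) can be sketched as below. This is a minimal illustration in PyTorch, assuming a plausible channel progression, kernel size, and classifier head; these details are not taken from the paper.

```python
# Hypothetical sketch of a lightweight CNN (LCNN) of the kind described in the
# abstract: 5 conv layers, stride-2 convolutions for down-sampling instead of
# pooling, 224x224 spectrogram input, 5 emotion classes (as in IESC).
# Channel widths, kernel size, and the linear head are assumptions.
import torch
import torch.nn as nn


class LightweightCNN(nn.Module):
    def __init__(self, num_emotions: int = 5):
        super().__init__()
        chans = [3, 16, 32, 64, 128, 256]  # assumed channel progression
        layers = []
        for c_in, c_out in zip(chans[:-1], chans[1:]):
            # stride=2 halves each spatial dimension, so no pooling layer
            # is needed: 224 -> 112 -> 56 -> 28 -> 14 -> 7 over five layers.
            layers += [
                nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
            ]
        self.features = nn.Sequential(*layers)
        self.classifier = nn.Linear(256 * 7 * 7, num_emotions)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)  # (N, 256, 7, 7) for 224x224 input
        return self.classifier(x.flatten(1))


model = LightweightCNN(num_emotions=5)
out = model(torch.randn(2, 3, 224, 224))  # a batch of 2 spectrogram images
print(out.shape)  # prints torch.Size([2, 5])
```

Replacing pooling with strided convolutions removes the separate pooling layers entirely, which is one common way to cut the per-layer operation count in a lightweight model.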
Pages: 23055–23073 (18 pages)
Related Papers (50 total)
  • [1] A lightweight 2D CNN based approach for speaker-independent emotion recognition from speech with new Indian Emotional Speech Corpora
    Singh, Youddha Beer
    Goel, Shivani
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (15) : 23055 - 23073
  • [2] Speaker-Independent Speech Emotion Recognition Based on CNN-BLSTM and Multiple SVMs
    Liu, Zhen-Tao
    Xiao, Peng
    Li, Dan-Yun
    Hao, Man
    [J]. INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2019, PT III, 2019, 11742 : 481 - 491
  • [3] Speaker Adversarial Neural Network (SANN) for Speaker-independent Speech Emotion Recognition
    Fahad, Md Shah
    Ranjan, Ashish
    Deepak, Akshay
    Pradhan, Gayadhar
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2022, 41 (11) : 6113 - 6135
  • [6] Domain Invariant Feature Learning for Speaker-Independent Speech Emotion Recognition
    Lu, Cheng
    Zong, Yuan
    Zheng, Wenming
    Li, Yang
    Tang, Chuangao
    Schuller, Bjoern W.
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 2217 - 2230
  • [7] Speaker-Independent Speech Emotion Recognition Based Multiple Kernel Learning of Collaborative Representation
    Zha, Cheng
    Zhang, Xinrang
    Zhao, Li
    Liang, Ruiyu
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2016, E99A (03) : 756 - 759
  • [8] Speaker-independent Speech Emotion Recognition Based on Random Forest Feature Selection Algorithm
    Cao, Wei-Hua
    Xu, Jian-Ping
    Liu, Zhen-Tao
    [J]. PROCEEDINGS OF THE 36TH CHINESE CONTROL CONFERENCE (CCC 2017), 2017, : 10995 - 10998
  • [9] Study on Speaker-Independent Emotion Recognition from Speech on Real-World Data
    Kostoulas, Theodoros
    Ganchev, Todor
    Fakotakis, Nikos
    [J]. VERBAL AND NONVERBAL FEATURES OF HUMAN-HUMAN AND HUMAN-MACHINE INTERACTIONS, 2008, 5042 : 235 - 242
  • [10] Text Independent Speaker and Emotion Independent Speech Recognition in Emotional Environment
    Revathi, A.
    Venkataramani, Y.
    [J]. INFORMATION SYSTEMS DESIGN AND INTELLIGENT APPLICATIONS, VOL 1, 2015, 339 : 43 - 52