Improving the performance of the speaker emotion recognition based on low dimension prosody features vector

Cited by: 6

Authors:

Gudmalwar, Ashishkumar Prabhakar [1]

Rao, Ch V. Rama [1]

Dutta, Anirban [1]

Affiliations:

[1] Natl Inst Technol, Shillong, Meghalaya, India

Source:

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY | 2019 / Vol. 22 / Issue 03

Keywords:

Prosody; PCA; Emotion recognition; Recognition rate; SPEECH;

DOI:

10.1007/s10772-018-09576-4

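Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification code:

0808; 0809;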

Abstract:

Speaker emotion recognition is an important research problem with many applications in human-robot interaction, human-computer interaction, etc. This work addresses recognizing a speaker's emotion from a speech utterance. The features used are pitch, log energy, zero-crossing rate, and the first three formant frequencies. Feature vectors are constructed from 11 statistical parameters of each feature. An Artificial Neural Network (ANN) is chosen as the classifier owing to its universal function approximation capability. In an ANN-based classifier, the time required for training the network and for classification depends on the dimension of the feature vector. This work focuses on developing a speaker emotion recognition system using prosody features and on reducing the dimensionality of the feature vectors. Principal component analysis (PCA) is used for feature vector dimensionality reduction. The Emotional Prosody Speech and Transcription corpus from the Linguistic Data Consortium (LDC) and the Berlin emotional database are used to evaluate the proposed approach on the recognition of seven emotion types. Compared with existing approaches, the proposed method achieves better performance. Experimental results show recognition rates of 75.32% and 84.5% for the Berlin emotional database and the LDC emotional speech database, respectively.
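
The dimensionality-reduction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that six features (pitch, log energy, zero-crossing rate, three formants) with 11 statistics each give a 66-dimensional vector per utterance. The data are synthetic, and the function name `pca_reduce`, the utterance count, and the target dimension of 20 are hypothetical choices made only for illustration.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components.

    Returns the reduced data, the component matrix, and the fraction
    of total variance explained by each retained component.
    """
    # Center each feature dimension at zero mean.
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal
    # directions, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return Xc @ components.T, components, explained[:n_components]


# Synthetic stand-in for real prosody statistics (assumption).
rng = np.random.default_rng(0)
n_utterances = 200          # hypothetical corpus size
n_features = 6 * 11         # 6 prosodic features x 11 statistics
X = rng.normal(size=(n_utterances, n_features))

# Hypothetical target dimension; the abstract does not state one.
Z, components, ratio = pca_reduce(X, n_components=20)
print(Z.shape, components.shape)
```

The reduced vectors in `Z` would then be the classifier input. A smaller input dimension means fewer input-layer weights, which is the training-time and classification-time motivation given in the abstract.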


Pages: 521-531

Number of pages: 11

50 records in total

[31] New Adaptive Feature Vector Construction Procedure for Speaker Emotion Recognition Based on Wavelet Transform and Genetic Algorithm
Soroka, Alexander M.
Kovalets, Pavel E.
Kheidorov, Igor E.
ADVANCES IN NEURAL NETWORKS - ISNN 2016, 2016, 9719 : 613 - 619
[32] Graph Learning Based Speaker Independent Speech Emotion Recognition
Xu, Xinzhou
Huang, Chengwei
Wu, Chen
Wang, Qingyun
Zhao, Li
ADVANCES IN ELECTRICAL AND COMPUTER ENGINEERING, 2014, 14 (02) : 17 - 22
[33] DNN-HMM-Based Speaker-Adaptive Emotion Recognition Using MFCC and Epoch-Based Features
Fahad, Md. Shah
Deepak, Akshay
Pradhan, Gayadhar
Yadav, Jainath
CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2021, 40 (01) : 466 - 489
[35] Fractal dimension pattern-based multiresolution analysis for rough estimator of speaker-dependent audio emotion recognition
Cheng, Miao
Tsoi, Ah Chung
INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2017, 15 (05)
[36] Improving Emotion Recognition Performance by Random-Forest-Based Feature Selection
Egorow, Olga
Siegert, Ingo
Wendemuth, Andreas
SPEECH AND COMPUTER (SPECOM 2018), 2018, 11096 : 134 - 144
[37] EEG-Based Emotion Recognition Using Frequency Domain Features and Support Vector Machines
Wang, Xiao-Wei
Nie, Dan
Lu, Bao-Liang
NEURAL INFORMATION PROCESSING, PT I, 2011, 7062 : 734 - +
[38] Improving Speech Emotion Recognition via Fine-tuning ASR with Speaker Information
Ta, Bao Thang
Nguyen, Tung Lam
Dang, Dinh Son
Le, Nhat Minh
Do, Van Hai
PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1596 - 1601
[39] Robust speaker recognition based on biologically inspired features
Zouhir, Youssef
Ben Fredj, Ines
Ouni, Kais
Zarka, Mohamed
INTERNATIONAL JOURNAL OF SIGNAL AND IMAGING SYSTEMS ENGINEERING, 2020, 12 (1-2) : 19 - 27
[40] Filter bank Based Cepstral Features for Speaker Recognition
Chougule, Sharada V.
Chavan, Mahesh S.
Gaikwad, M. S.
2014 IEEE GLOBAL CONFERENCE ON WIRELESS COMPUTING AND NETWORKING (GCWCN), 2014, : 102 - 106
