Research on Speech Emotional Feature Extraction Based on Multidimensional Feature Fusion

Cited: 2
Authors
Zheng, Chunjun [1 ,2 ]
Wang, Chunli [1 ]
Sun, Wei [2 ]
Jia, Ning [2 ]
Affiliations
[1] Dalian Maritime Univ, Dalian, Liaoning, Peoples R China
[2] Dalian Neusoft Univ Informat, Dalian, Liaoning, Peoples R China
Source
ADVANCED DATA MINING AND APPLICATIONS, ADMA 2019 | 2019 / Vol. 11888
Keywords
Low-Level Acoustic Descriptors; Convolutional Recurrent Neural Network; Feature Fusion; Speech emotion recognition; RECOGNITION;
DOI
10.1007/978-3-030-35231-8_39
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the field of speech processing, speech emotion recognition is a challenging task with broad application prospects. Because the effectiveness of the speech feature set directly affects recognition accuracy, the search for effective features is one of the key issues in speech emotion recognition. Emotional expression is often entangled with speaker-specific characteristics, so generalized effective speech features are difficult to find; identifying them is one of the main research topics of this paper. A general emotional feature representation of the speech signal is constructed from the perspectives of both local and global features: (1) A speech emotion recognition model is built from the spectrogram and a Convolutional Recurrent Neural Network (CRNN), which effectively learns the spatial characteristics of the emotional information and captures salient local features. (2) Using Low-Level acoustic Descriptors (LLD), a large number of experiments screen a limited-dimensional feature representation covering energy, fundamental frequency, spectrum, and statistical functionals computed over these low-level features, yielding a global feature description. (3) The preceding features are combined, and the performance of the various features is verified on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) emotional corpus, confirming the accuracy and representativeness of the features obtained in this paper.
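The abstract's pipeline combines frame-level local features with utterance-level global statistics before classification. The following is a minimal, hedged sketch of that fusion idea in pure Python; the specific descriptors (log-energy, zero-crossing rate), frame sizes, and the `fuse` helper are illustrative assumptions, not the paper's actual CRNN/LLD configuration.

```python
import math

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (local view)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def lld_features(frame):
    """Frame-level Low-Level Descriptors: log-energy and zero-crossing rate.
    These two stand in for the energy/F0/spectral LLDs named in the abstract."""
    energy = sum(x * x for x in frame)
    log_energy = math.log(energy + 1e-10)          # small offset avoids log(0)
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (len(frame) - 1)
    return [log_energy, zcr]

def global_statistics(frames):
    """Utterance-level statistical functionals (mean, std) over frame LLDs,
    giving the fixed-dimensional global description."""
    llds = [lld_features(f) for f in frames]
    feats = []
    for dim in zip(*llds):                         # iterate per LLD dimension
        mean = sum(dim) / len(dim)
        var = sum((x - mean) ** 2 for x in dim) / len(dim)
        feats.extend([mean, math.sqrt(var)])
    return feats

def fuse(local_vec, global_vec):
    """Early fusion by concatenation: a learned local embedding
    (e.g. a CRNN output, here just a placeholder) + global LLD statistics."""
    return list(local_vec) + list(global_vec)

# Toy usage: a synthetic signal, 400-sample frames with a 160-sample hop.
signal = [math.sin(0.1 * i) for i in range(1600)]
frames = frame_signal(signal, 400, 160)
fused = fuse([0.0] * 64, global_statistics(frames))  # 64-dim dummy local embedding
```

In the paper itself the local branch is a CRNN over the spectrogram rather than a placeholder vector; the sketch only shows how the two feature views are concatenated into one representation.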
Pages: 535-547
Page count: 13