Research on Speech Emotional Feature Extraction Based on Multidimensional Feature Fusion

Cited: 2
Authors
Zheng, Chunjun [1 ,2 ]
Wang, Chunli [1 ]
Sun, Wei [2 ]
Jia, Ning [2 ]
Affiliations
[1] Dalian Maritime Univ, Dalian, Liaoning, Peoples R China
[2] Dalian Neusoft Univ Informat, Dalian, Liaoning, Peoples R China
Source
ADVANCED DATA MINING AND APPLICATIONS, ADMA 2019 | 2019 / Vol. 11888
Keywords
Low-Level Acoustic Descriptors; Convolutional Recurrent Neural Network; Feature Fusion; Speech emotion recognition; RECOGNITION;
DOI
10.1007/978-3-030-35231-8_39
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the field of speech processing, speech emotion recognition is a challenging task with broad application prospects. Because the effectiveness of the speech feature set directly affects recognition accuracy, the search for effective features is one of the key issues in speech emotion recognition. Emotional expression is often entangled with speaker-specific characteristics, so generalized effective speech features are difficult to find; identifying them is one of the main research topics of this paper. A general emotional feature representation of the speech signal is constructed from the perspectives of both local and global features: (1) A speech emotion recognition model is built from the spectrogram and a Convolutional Recurrent Neural Network (CRNN), which effectively learns the spatial characteristics of the emotional information and captures salient local features. (2) Using Low-Level acoustic Descriptors (LLD), a large number of experiments screen a limited-dimensional feature representation covering energy, fundamental frequency, spectrum, and statistical functionals computed over these low-level features, yielding a global feature description. (3) The preceding features are combined, and the performance of the various features is verified on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) emotional corpus, confirming the accuracy and representativeness of the features obtained in this paper.
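The abstract's pipeline combines frame-level local features with utterance-level global statistics before classification. The following is a minimal, hedged sketch of that fusion idea in pure Python; the specific descriptors (log-energy, zero-crossing rate), frame sizes, and the `fuse` helper are illustrative assumptions, not the paper's actual CRNN/LLD configuration.

```python
import math

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (local view)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def lld_features(frame):
    """Frame-level Low-Level Descriptors: log-energy and zero-crossing rate.
    These two stand in for the energy/F0/spectral LLDs named in the abstract."""
    energy = sum(x * x for x in frame)
    log_energy = math.log(energy + 1e-10)          # small offset avoids log(0)
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (len(frame) - 1)
    return [log_energy, zcr]

def global_statistics(frames):
    """Utterance-level statistical functionals (mean, std) over frame LLDs,
    giving the fixed-dimensional global description."""
    llds = [lld_features(f) for f in frames]
    feats = []
    for dim in zip(*llds):                         # iterate per LLD dimension
        mean = sum(dim) / len(dim)
        var = sum((x - mean) ** 2 for x in dim) / len(dim)
        feats.extend([mean, math.sqrt(var)])
    return feats

def fuse(local_vec, global_vec):
    """Early fusion by concatenation: a learned local embedding
    (e.g. a CRNN output, here just a placeholder) + global LLD statistics."""
    return list(local_vec) + list(global_vec)

# Toy usage: a synthetic signal, 400-sample frames with a 160-sample hop.
signal = [math.sin(0.1 * i) for i in range(1600)]
frames = frame_signal(signal, 400, 160)
fused = fuse([0.0] * 64, global_statistics(frames))  # 64-dim dummy local embedding
```

In the paper itself the local branch is a CRNN over the spectrogram rather than a placeholder vector; the sketch only shows how the two feature views are concatenated into one representation.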
Pages: 535-547
Page count: 13