Speech Emotion Recognition via Multi-Level Attention Network

Cited by: 8
Authors
Liu, Ke [1]
Wang, Dekui [1]
Wu, Dongya [1]
Liu, Yutao [1]
Feng, Jun [1]
Affiliations
[1] Northwest University, School of Information Science and Technology, Xi'an 710127, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
MFCC; multi-scale feature; attention mechanism; speech emotion recognition; NEURAL-NETWORK; FEATURES;
DOI
10.1109/LSP.2022.3219352
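Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809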
Abstract
Existing work on human speech emotion recognition (SER) has made great progress using the popular mel-scale frequency cepstral coefficient (MFCC). However, it rarely pays attention to the low-level emotion-related features in MFCC, such as the underlying interactive relations. In this letter, we propose a novel multi-level attention network (MLAnet), which contains a multi-scale low-level feature (MLF) extractor and a multi-unit attention (MUA) module. Within the MLF extractor, we apply the attention mechanism to minimize task-irrelevant information, which harms SER performance. Since the features extracted by the MLF extractor contain rich domain-specific emotion information, we further present an MUA module to simultaneously weight the features along the time, frequency, and channel dimensions. In this way, the discriminative emotion features in each dimension can be extracted by the corresponding weighting block. Experimental results on two benchmark datasets demonstrate that the proposed method outperforms other state-of-the-art approaches.
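To make the idea of weighting features along the time, frequency, and channel dimensions concrete, here is a minimal sketch in plain Python. It is not the authors' MUA module: the abstract does not specify its layers, so this stand-in uses parameter-free mean pooling followed by a sigmoid, where a real attention block would use learned layers. The function names (`axis_weights`, `multi_unit_weighting`) and the (channel, frequency, time) nested-list layout are assumptions made for illustration only.

```python
import math


def sigmoid(x):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def axis_weights(feat, axis):
    """Return one weight per index along `axis` (0=channel, 1=freq, 2=time).

    Assumption: weights come from mean pooling over the other two axes
    followed by a sigmoid; a learned attention block would replace this.
    """
    C, F, T = len(feat), len(feat[0]), len(feat[0][0])
    sizes = (C, F, T)
    sums = [0.0] * sizes[axis]
    for c in range(C):
        for f in range(F):
            for t in range(T):
                sums[(c, f, t)[axis]] += feat[c][f][t]
    count = (C * F * T) / sizes[axis]  # elements pooled per index
    return [sigmoid(s / count) for s in sums]


def multi_unit_weighting(feat):
    """Scale every element by its channel, frequency, and time weights."""
    wc = axis_weights(feat, 0)
    wf = axis_weights(feat, 1)
    wt = axis_weights(feat, 2)
    return [[[feat[c][f][t] * wc[c] * wf[f] * wt[t]
              for t in range(len(wt))]
             for f in range(len(wf))]
            for c in range(len(wc))]


if __name__ == "__main__":
    # Hypothetical 2-channel, 3-bin, 4-frame feature map.
    feat = [[[0.1 * (c + f + t) for t in range(4)]
             for f in range(3)]
            for c in range(2)]
    weighted = multi_unit_weighting(feat)
    print(len(weighted), len(weighted[0]), len(weighted[0][0]))
```

The sketch applies the three weight vectors jointly by multiplication; whether the proposed network combines its weighting blocks this way is not stated in the abstract.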
Pages: 2278-2282
Number of pages: 5
Related papers
50 items in total
  • [31] Multi-Level Ensemble Network for Scene Recognition
    Longhao Zhang
    Lingqiao Li
    Xipeng Pan
    Zhiwei Cao
    Qianyu Chen
    Huihua Yang
    [J]. Multimedia Tools and Applications, 2019, 78 : 28209 - 28230
  • [32] Deep Learning-Based Speech Emotion Recognition Using Multi-Level Fusion of Concurrent Features
    Kakuba, Samuel
    Poulose, Alwin
    Han, Dong Seog
    [J]. IEEE ACCESS, 2022, 10 : 125538 - 125551
  • [33] Multi-Level Attention-Based Categorical Emotion Recognition Using Modulation-Filtered Cochleagram
    Peng, Zhichao
    He, Wenhua
    Li, Yongwei
    Du, Yegang
    Dang, Jianwu
    [J]. APPLIED SCIENCES-BASEL, 2023, 13 (11)
  • [34] Multi-level Residual Attention Network for Speckle Suppression
    Lei, Yu
    Liu, Shuaiqi
    Zhang, Luyao
    Zhao, Ling
    Zhao, Jie
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PT IV, 2021, 13022 : 288 - 299
  • [35] Multi-Level Attention Network for Retinal Vessel Segmentation
    Yuan, Yuchen
    Zhang, Lei
    Wang, Lituan
    Huang, Haiying
    [J]. IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, 26 (01) : 312 - 323
  • [36] Relation Classification via Multi-Level Attention CNNs
    Wang, Linlin
    Cao, Zhu
    de Melo, Gerard
    Liu, Zhiyuan
    [J]. PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016 : 1298 - 1307
  • [37] Temporal Attention Convolutional Network for Speech Emotion Recognition with Latent Representation
    Liu, Jiaxing
    Liu, Zhilei
    Wang, Longbiao
    Gao, Yuan
    Guo, Lili
    Dang, Jianwu
    [J]. INTERSPEECH 2020, 2020 : 2337 - 2341
  • [38] Infrared image denoising via adversarial learning with multi-level feature attention network
    Yang, Pengfei
    Wu, Heng
    Cheng, Lianglun
    Luo, Shaojuan
    [J]. INFRARED PHYSICS & TECHNOLOGY, 2023, 128
  • [39] The multi-level classification and regression network for visual tracking via residual channel attention
    Yu, Junyang
    Zuo, Mengle
    Dong, Lifeng
    Zhang, Huanlong
    He, Xin
    [J]. DIGITAL SIGNAL PROCESSING, 2022, 120
  • [40] Attention gated tensor neural network architectures for speech emotion recognition
    Pandey, Sandeep Kumar
    Shekhawat, Hanumant Singh
    Prasanna, S. R. M.
    [J]. BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2022, 71