Deep and shallow features fusion based on deep convolutional neural network for speech emotion recognition

Cited by: 34
Authors
Sun L. [1, 2]
Chen J. [1]
Xie K. [1]
Gu T. [1]
Affiliations
[1] College of Telecommunications & Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing
[2] Key Lab of Broadband Wireless Communication and Sensor Network Technology, Ministry of Education, Nanjing University of Posts and Telecommunications, Nanjing
Funding
National Natural Science Foundation of China
Keywords
Deep and shallow feature fusion; Deep convolutional neural network; Speech emotion recognition
DOI
10.1007/s10772-018-9551-4
Abstract
Recent years have witnessed great progress in speech emotion recognition using deep convolutional neural networks (DCNNs). To improve recognition performance, a novel feature fusion method is proposed. As the convolutional layers go deeper, the convolutional features of traditional DCNNs gradually become more abstract, which may not be the best features for speech emotion recognition. On the other hand, shallow features include only global information, without the detailed information extracted by deeper convolutional layers. Based on these observations, we design a deep and shallow feature fusion convolutional network, which combines features from different levels of the network for speech emotion recognition. The proposed network allows us to fully exploit both deep and shallow features. The popular Berlin data set is used in our experiments. The experimental results show that the proposed network further improves the speech emotion recognition rate, which demonstrates its effectiveness. © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
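The abstract does not say how features from different levels are combined, so the following is only a minimal sketch of one common option, assuming each feature map is reduced by global average pooling and the resulting vectors are concatenated. The function names, channel counts, and map sizes are hypothetical and are not taken from the paper.

```python
import numpy as np


def global_average_pool(feature_map):
    """Reduce a (channels, height, width) map to a (channels,) vector."""
    return feature_map.mean(axis=(1, 2))


def fuse_deep_and_shallow(shallow_map, deep_map):
    """Concatenate pooled shallow and deep features (assumed fusion rule)."""
    shallow_vec = global_average_pool(shallow_map)
    deep_vec = global_average_pool(deep_map)
    return np.concatenate([shallow_vec, deep_vec])


# Stand-in activations: an early layer with few channels and a large map,
# and a late layer with many channels and a small map (illustrative sizes).
rng = np.random.default_rng(0)
shallow_map = rng.standard_normal((16, 32, 32))
deep_map = rng.standard_normal((64, 4, 4))

fused = fuse_deep_and_shallow(shallow_map, deep_map)
print(fused.shape)  # (80,)
```

In a setup like this, the fused vector would be passed to a classifier over the emotion classes; other fusion rules, such as weighted sums or learned projections, fit the same description equally well.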
Pages: 931 - 940
Number of pages: 9
Related papers
50 in total
  • [21] Deep Convolutional Neural Network for Arabic Speech Recognition
    Amari, Rafik
    Noubigh, Zouhaira
    Zrigui, Salah
    Berchech, Dhaou
    Nicolas, Henri
    Zrigui, Mounir
    COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2022, 2022, 13501 : 120 - 134
  • [22] An efficient deep convolutional neural network with features fusion for radar signal recognition
    Si, Weijian
    Wan, Chenxia
    Deng, Zhian
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (02) : 2871 - 2885
  • [24] Towards an efficient backbone for preserving features in speech emotion recognition: deep-shallow convolution with recurrent neural network
    Goel, Dev Priya
    Mahajan, Kushagra
    Ngoc Duy Nguyen
    Srinivasan, Natesan
    Lim, Chee Peng
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (03) : 2457 - 2469
  • [26] Deep convolutional neural network architecture for facial emotion recognition
    Pruthviraja, Dayananda
    Kumar, Ujjwal Mohan
    Parameswaran, Sunil
    Chowdary, Vemulapalli Guna
    Bharadwaj, Varun
    PEERJ COMPUTER SCIENCE, 2024, 10 : 1 - 20
  • [27] Facial Emotion Recognition Using Deep Convolutional Neural Network
    Pranav, E.
    Kamal, Suraj
    Chandran, Satheesh C.
    Supriya, M. H.
    2020 6TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATION SYSTEMS (ICACCS), 2020 : 317 - 320
  • [28] Deep Convolutional Neural Networks for Feature Extraction in Speech Emotion Recognition
    Heracleous, Panikos
    Mohammad, Yasser
    Yoneyama, Akio
    HUMAN-COMPUTER INTERACTION. RECOGNITION AND INTERACTION TECHNOLOGIES, HCI 2019, PT II, 2019, 11567 : 117 - 132
  • [29] Enhancing Speech Emotion Recognition Using Deep Convolutional Neural Networks
    Islam, M. M. Manjurul
    Kabir, Md Alamgir
    Sheikh, Alamin
    Saiduzzaman, Muhammad
    Hafid, Abdelakram
    Abdullah, Saad
    PROCEEDINGS OF THE 2024 9TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING TECHNOLOGIES, ICMLT 2024, 2024 : 95 - 100
  • [30] Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching
    Zhang, Shiqing
    Zhang, Shiliang
    Huang, Tiejun
    Gao, Wen
    IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (06) : 1576 - 1590