COMPACT CONVOLUTIONAL RECURRENT NEURAL NETWORKS VIA BINARIZATION FOR SPEECH EMOTION RECOGNITION

Cited: 0
Authors
Zhao, Huan [1 ]
Xiao, Yufeng [1 ]
Han, Jing [2 ]
Zhang, Zixing [1 ,3 ]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha, Hunan, Peoples R China
[2] Univ Augsburg, Embedded Intelligence Hlth Care & Wellbeing, Augsburg, Germany
[3] Imperial Coll London, Grp Language Audio & Mus, London, England
Funding
U.S. National Science Foundation; National Key Research and Development Program of China;
Keywords
binary neural network; compact convolutional recurrent neural network; speech emotion recognition; green computing;
DOI
10.1109/icassp.2019.8683389
Chinese Library Classification: O42 [Acoustics]
Discipline codes: 070206; 082403
Abstract
Despite the great advances, most recently developed automatic speech recognition systems focus on working in a server-client manner, and thus often incur a high computational cost in terms of storage size and memory accesses. This, however, does not satisfy the increasing demand for succinct models that can run smoothly on embedded devices like smartphones. To this end, in this paper we propose a neural network compression method that quantizes the weights of a neural network from their original full-precision values into binary values, which can then be stored and processed with only one bit per value. In doing so, traditional large neural network-based speech emotion recognition models can be greatly compressed into smaller ones that demand a lower computational cost. To evaluate the feasibility of the proposed approach, we take a state-of-the-art speech emotion recognition model, i.e., convolutional recurrent neural networks, as an example, and conduct experiments on two widely used emotional databases. We find that the proposed binary neural networks are able to yield a remarkable model compression rate at only a limited expense of model performance.
Pages: 6690-6694 (5 pages)
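The abstract's core idea, replacing full-precision weights with {-1, +1} values stored at one bit each, can be illustrated with a short sketch. This is not the authors' implementation; it follows the common XNOR-Net-style scheme (a sign function plus a per-tensor scaling factor, here assumed to be the mean absolute weight), and the function names are hypothetical:

```python
import numpy as np

def binarize_weights(w):
    """Binarize a full-precision weight tensor to {-1, +1} with a
    real-valued per-tensor scale (alpha = mean of |w|), an
    XNOR-Net-style scheme; illustrative only, not the paper's exact method."""
    alpha = float(np.abs(w).mean())   # scale retained in full precision
    wb = np.where(w >= 0, 1.0, -1.0)  # sign(w), mapping sign(0) to +1
    return alpha, wb

def pack_bits(wb):
    """Store the {-1, +1} tensor at one bit per value:
    map -1 -> 0, +1 -> 1 and pack eight values per byte."""
    bits = (wb.flatten() > 0).astype(np.uint8)
    return np.packbits(bits)

# A 32-bit float weight matrix compresses ~32x in storage.
w = np.random.randn(256, 256).astype(np.float32)
alpha, wb = binarize_weights(w)
packed = pack_bits(wb)
print(w.nbytes / packed.nbytes)  # -> 32.0
```

At inference time, such binarized layers approximate `w @ x` by `alpha * (wb @ x)`, which is what allows multiplications to be replaced by cheap sign-based operations and yields the storage and memory-access savings the abstract refers to.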