Convolutional Neural Turing Machine for Speech Separation

Authors
Chien, Jen-Tzung [1]
Tsou, Kai-Wei [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu, Taiwan
Keywords
Recurrent neural network; convolutional neural network; neural Turing machine; monaural speech separation
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Long short-term memory (LSTM) has been successfully applied to monaural speech separation. Temporal information is learned through dynamic states that evolve over time and are stored as an internal memory, and the spectro-temporal data matrix of the mixed signal is flattened into input vectors. This approach has two limitations. First, the internal memory in LSTM cannot sufficiently characterize long-term information from different sources. Second, flattening smears the temporal correlation and the frequency neighborhood structure of the data. To address these limitations, this paper presents a convolutional neural Turing machine (ConvNTM), in which feature maps of the spectro-temporal data are extracted and embedded in an external memory at each time step. ConvNTM aims to preserve the spectro-temporal structure of long sequential signals, which is exploited to estimate the separated spectral signals. An addressing mechanism continuously computes the read and write heads, which retrieve and update memory slots, respectively. The memory-augmented source separation is implemented for single-channel speech enhancement. Experimental results show that ConvNTM outperforms LSTM, NTM, and convolutional LSTM for speech enhancement in terms of the short-term objective intelligibility measure.
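The read and write heads mentioned in the abstract can be illustrated with the generic content-based addressing of a standard neural Turing machine. The sketch below is not the ConvNTM described in the paper: the abstract does not give its equations, so the sketch assumes each memory slot is a plain vector (ConvNTM stores feature maps instead), and the names `content_weights`, `read`, `write`, `beta`, `erase`, and `add` are illustrative choices rather than the authors' notation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-8)  # small constant avoids division by zero


def content_weights(memory, key, beta):
    """Softmax over key-to-slot similarities; beta > 0 sharpens the focus."""
    scores = [beta * cosine(slot, key) for slot in memory]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, w):
    """Read head: weighted sum of memory slots."""
    width = len(memory[0])
    return [sum(w[i] * memory[i][j] for i in range(len(memory)))
            for j in range(width)]


def write(memory, w, erase, add):
    """Write head: erase, then add, each scaled by the slot's weight."""
    return [[m * (1.0 - wi * e) + wi * a for m, e, a in zip(slot, erase, add)]
            for slot, wi in zip(memory, w)]


# Toy memory with three slots of width two (assumed values).
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
key = [1.0, 0.0]

w = content_weights(memory, key, beta=5.0)
r = read(memory, w)
updated = write(memory, w, erase=[1.0, 1.0], add=[0.5, 0.5])
```

Because the weights are a softmax, every slot is read and updated a little, which keeps the whole operation differentiable; the slot most similar to the key receives the largest share.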
Pages: 81-85
Page count: 5