Separation and deconvolution of speech using recurrent neural networks

Cited by: 0
Authors
Li, Y [1 ]
Powers, D [1 ]
Wen, P [1 ]
Affiliations
[1] Flinders Univ S Australia, Sch Informat & Engn, Adelaide, SA 5001, Australia
Keywords
blind signal/source separation; speech recognition; recurrent neural networks; 2D system theory; output decorrelation;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper focuses on improving Speech Recognition or Speech Reading (SR) by combining multiple auditory sources. We present results obtained in the traditional Blind Signal Separation and Deconvolution (BSS) paradigm using two speaker signals from two sources, investigating artificial linear and convolutive mixtures as well as real recordings. The adaptive algorithm is based on two-dimensional (2D) system theory and uses recurrent neural networks (RNNs). The structure of RNNs matches the characteristics of convolutively mixed signals (e.g., audio signals): the feedback paths in an RNN provide a memory of the signals at the relevant delays, so that better separation can be achieved. The cross-correlations of the RNN outputs are used as the separation criterion.
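The abstract does not give the details of the paper's adaptive 2D-RNN algorithm, but the separation criterion it names — driving the cross-correlations of the outputs toward zero — can be illustrated with a classic second-order, lagged-decorrelation sketch (AMUSE-style) for an instantaneous two-source mixture. All signals, the mixing matrix, and the lag value below are hypothetical choices for illustration, not taken from the paper.

```python
import numpy as np

n = 4000
t = np.arange(n)

# Two zero-mean sources with distinct temporal structure (different
# lag-1 autocorrelations), so lagged decorrelation can tell them apart.
# Both sources are hypothetical stand-ins for two speakers.
s1 = np.sin(2 * np.pi * 0.01 * t)
s2 = np.sin(2 * np.pi * 0.2 * t + 0.5)
S = np.vstack([s1, s2])

# Unknown instantaneous mixing matrix (hypothetical values).
A = np.array([[1.0, 0.6],
              [0.5, 1.0]])
X = A @ S  # observed mixtures

# Step 1: whiten the mixtures (remove zero-lag correlations).
X = X - X.mean(axis=1, keepdims=True)
cov = (X @ X.T) / n
d, E = np.linalg.eigh(cov)
W_white = E @ np.diag(1.0 / np.sqrt(d)) @ E.T
Z = W_white @ X

# Step 2: find the rotation that also diagonalizes the lagged
# covariance, i.e. zeroes the output cross-correlation at lag tau.
tau = 1
C_tau = (Z[:, :-tau] @ Z[:, tau:].T) / (n - tau)
C_sym = 0.5 * (C_tau + C_tau.T)  # symmetrize before eigendecomposition
_, V = np.linalg.eigh(C_sym)
Y = V.T @ Z  # recovered sources (up to permutation, sign, and scale)
```

This batch, instantaneous version is only a sketch of the decorrelation principle; the paper's method is instead an adaptive recurrent network whose feedback delays let the same idea extend to convolutive mixtures.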
Pages: 1303-1309 (7 pages)
Related papers (items 41-50 of 50)
  • [41] An Improved Supervised Speech Separation Method Based on Perceptual Weighted Deep Recurrent Neural Networks
    Han, Wei
    Zhang, Xiongwei
    Sun, Meng
    Li, Li
    Shi, Wenhua
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2017, E100A (02) : 718 - 721
  • [42] Recurrent timing neural networks for joint F0-localisation based speech separation
    Wrigley, Stuart N.
    Brown, Guy J.
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PTS 1-3, PROCEEDINGS, 2007, : 157 - 160
  • [43] RECURRENT DEEP STACKING NETWORKS FOR SUPERVISED SPEECH SEPARATION
    Wang, Zhong-Qiu
    Wang, DeLiang
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 71 - 75
  • [44] Single Channel Speech Enhancement Using Temporal Convolutional Recurrent Neural Networks
    Li, Jingdong
    Zhang, Hui
    Zhang, Xueliang
    Li, Changliang
    [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 896 - 900
  • [45] Recent advances in conversational speech recognition using conventional and recurrent neural networks
    Saon, G.
    Picheny, M.
    [J]. IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 2017, 61 (4-5)
  • [46] AUTOMATIC SPEECH EMOTION RECOGNITION USING RECURRENT NEURAL NETWORKS WITH LOCAL ATTENTION
    Mirsamadi, Seyedmahdad
    Barsoum, Emad
    Zhang, Cha
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 2227 - 2231
  • [47] Dysarthria Speech Detection Using Convolutional Neural Networks with Gated Recurrent Unit
    Shih, Dong-Her
    Liao, Ching-Hsien
    Wu, Ting-Wei
    Xu, Xiao-Yin
    Shih, Ming-Hung
    [J]. HEALTHCARE, 2022, 10 (10)
  • [48] Speech Bandwidth Extension Using Bottleneck Features and Deep Recurrent Neural Networks
    Gu, Yu
    Ling, Zhen-Hua
    Dai, Li-Rong
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 297 - 301
  • [49] Segment-Based Speech Emotion Recognition Using Recurrent Neural Networks
    Tzinis, Efthymios
    Potamianos, Alexandros
    [J]. 2017 SEVENTH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2017, : 190 - 195
  • [50] Single Channel Speech Source Separation Using Hierarchical Deep Neural Networks
    Noorani, Seyed Majid
    Seyedin, Sanaz
    [J]. 2020 28TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2020, : 466 - 470