LOW-LATENCY APPROXIMATION OF BIDIRECTIONAL RECURRENT NETWORKS FOR SPEECH DENOISING

Cited by: 0
Authors
Wichern, Gordon [1 ]
Lukin, Alexey [1 ]
Affiliations
[1] iZotope Inc, Cambridge, MA 02139 USA
Keywords
Speech enhancement; source separation; time-frequency masking; bidirectional recurrent networks; lookahead convolution
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The ability to separate speech from non-stationary background disturbances using only a single channel of information has increased significantly with the adoption of deep learning techniques. In these approaches, a time-frequency mask that recovers clean speech from noisy mixtures is learned from data. Recurrent neural networks are particularly well-suited to this sequential prediction task, with the bidirectional variant (e.g., BLSTM) achieving strong results. The downside of bidirectional models is that they require offline operation to perform both a forward and backward pass over the data. In this paper we compare two different low-latency bidirectional approximations. The first uses block processing with a regular bidirectional network, while the second uses the recently proposed lookahead convolution layer. Our results show that using just 1000 ms of backward context can recover approximately 75% of the performance improvement gained from using bidirectional as opposed to forward-only recurrent networks.
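The lookahead convolution mentioned above can be illustrated with a minimal sketch. It assumes the common per-feature formulation in which each output frame is a weighted sum of the current frame and the next tau frames, with zero-padding past the end of the sequence. The function names, the weight layout, and the 10 ms hop size used in the usage line are hypothetical choices for illustration, not details taken from the paper.

```python
import math
from typing import List


def lookahead_conv(h: List[List[float]], w: List[List[float]]) -> List[List[float]]:
    """Per-feature lookahead convolution (illustrative sketch).

    h: T frames, each with D features (e.g. forward-RNN outputs).
    w: tau + 1 taps, each with D weights; tap 0 applies to the current
       frame, tap j applies to the frame j steps in the future.
    Frames past the end of the sequence are treated as zeros.
    """
    num_frames = len(h)
    num_features = len(h[0]) if num_frames else 0
    out = []
    for t in range(num_frames):
        row = [0.0] * num_features
        for j, taps in enumerate(w):
            if t + j >= num_frames:
                break  # implicit zero-padding beyond the last frame
            for i in range(num_features):
                row[i] += taps[i] * h[t + j][i]
        out.append(row)
    return out


def lookahead_frames(context_ms: float, hop_ms: float) -> int:
    """Number of future frames needed to cover context_ms at a given hop."""
    return math.ceil(context_ms / hop_ms)


# Assumed 10 ms hop: 1000 ms of future context corresponds to 100 frames.
tau = lookahead_frames(1000, 10)
```

Because the layer only looks tau frames ahead, its algorithmic latency is fixed by tau and the hop size, whereas a full backward pass needs the whole utterance.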
Pages: 66-70
Page count: 5