FusionRNN: Shared Neural Parameters for Multi-Channel Distant Speech Recognition

被引:0
|
作者
Parcollet, Titouan [1 ]
Qiu, Xinchi [1 ]
Lane, Nicholas D. [1 ,2 ]
机构
[1] Univ Oxford, Oxford, England
[2] Samsung AI, Cambridge, England
来源
关键词
Multi-channel distant speech recognition; shared neural parameters; light gated recurrent unit neural networks; NETWORKS;
D O I
10.21437/Interspeech.2020-2102
中图分类号
R36 [病理学]; R76 [耳鼻咽喉科学];
学科分类号
100104 ; 100213 ;
摘要
Distant speech recognition remains a challenging application for modern deep learning based Automatic Speech Recognition (ASR) systems, due to complex recording conditions involving noise and reverberation. Multiple microphones are commonly combined with well-known speech processing techniques to enhance the original signals and thus enhance the speech recognizer performance. These multi-channel follow similar input distributions with respect to the global speech information but also contain an important part of noise. Consequently, the input representation robustness is key to obtaining reasonable recognition rates. In this work, we propose a Fusion Layer (FL) based on shared neural parameters. We use it to produce an expressive embedding of multiple microphone signals, that can easily be combined with any existing ASR pipeline. The proposed model called FusionRNN showed promising results on a multi-channel distant speech recognition task, and consistently outperformed baseline models while maintaining an equal training time.
引用
收藏
页码:1678 / 1682
页数:5
相关论文
共 50 条
  • [21] Multi-channel Attention for End-to-End Speech Recognition
    Braun, Stefan
    Neil, Daniel
    Anumula, Jithendar
    Ceolini, Enea
    Liu, Shih-Chii
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 17 - 21
  • [22] Audio-visual Multi-channel Recognition of Overlapped Speech
    Yu, Jianwei
    Wu, Bo
    Gu, Rongzhi
    Zhang, Shi-Xiong
    Chen, Lianwu
    Xu, Yong
    Yu, Meng
    Su, Dan
    Yu, Dong
    Liu, Xunying
    Meng, Helen
    INTERSPEECH 2020, 2020, : 3496 - 3500
  • [23] The segmentation of multi-channel meeting recordings for automatic speech recognition
    Dines, John
    Vepa, Jithendra
    Hain, Thomas
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1213 - +
  • [24] MULTI-CHANNEL OVERLAPPED SPEECH RECOGNITION WITH LOCATION GUIDED SPEECH EXTRACTION NETWORK
    Chen, Zhuo
    Xiao, Xiong
    Yoshioka, Takuya
    Erdogan, Hakan
    Li, Jinyu
    Gong, Yifan
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 558 - 565
  • [25] A MULTI-CHANNEL CORPUS FOR DISTANT-SPEECH INTERACTION IN PRESENCE OF KNOWN INTERFERENCES
    Zwyssig, Erich
    Ravanelli, Mirco
    Svaizer, Piergiorgio
    Omologo, Maurizio
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4480 - 4484
  • [26] MULTI-CHANNEL SPEECH ENHANCEMENT USING GRAPH NEURAL NETWORKS
    Tzirakis, Panagiotis
    Kumar, Anurag
    Donley, Jacob
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3415 - 3419
  • [27] NEURAL NETWORKS FOR DISTANT SPEECH RECOGNITION
    Renals, Steve
    Swietojanski, Pawel
    2014 4TH JOINT WORKSHOP ON HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS (HSCMA), 2014, : 172 - 176
  • [28] A unified network for multi-speaker speech recognition with multi-channel recordings
    Liu, Conggui
    Inoue, Nakamasa
    Shinoda, Koichi
    2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017), 2017, : 1304 - 1307
  • [29] Deep Neural Network-Based Generalized Sidelobe Canceller for Robust Multi-channel Speech Recognition
    Li, Guanjun
    Liang, Shan
    Nie, Shuai
    Liu, Wenju
    Yang, Zhanlei
    Xiao, Longshuai
    INTERSPEECH 2020, 2020, : 51 - 55
  • [30] Speech distortion weighted multi-channel Wiener filter and its application to speech recognition
    Kim, Gibak
    IEICE ELECTRONICS EXPRESS, 2015, 12 (06): : 1 - 7