AN UNSUPERVISED LEARNING APPROACH TO NEURAL-NET-SUPPORTED WPE DEREVERBERATION

Authors
Petkov, Petko N. [1 ]
Tsiaras, Vasileios [2 ]
Doddipatla, Rama [1 ]
Stylianou, Yannis [2 ]
Affiliations
[1] Toshiba Res Europe Ltd, Cambridge, England
[2] Univ Crete, Iraklion, Greece
Keywords
reverberation; speech enhancement; neural network; automatic speech recognition; SPEECH DEREVERBERATION; REVERBERANT; INTELLIGIBILITY; RECOGNITION; END;
DOI
10.1109/icassp.2019.8683542
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Reverberation degrades signal quality and increases word error rates in automatic speech recognition (ASR). Reverberation suppression is thus a key component in listening enhancement devices and ASR front ends. The weighted prediction error (WPE) method is a prominent and effective approach that gained popularity in recent ASR challenges. The need for iterative optimization in WPE leads to high computational cost and instabilities for short signals. Neural net (NN) supported WPE was proposed to alleviate these issues. However, NN training requires parallel data, i.e., reverberant and "clean" (direct sound plus early reflections) speech, which is not available in general. We show that the supporting network can be trained efficiently, without any supervision, using reverberant speech only. Consequently, adaptation to unseen environments is greatly simplified. Network training involves the complete dereverberation system and relies on complex-valued backpropagation. The experimental validation confirms that the proposed approach matches the performance of the method with parallel training data, both in terms of perceptual quality and ASR word error rates.
Pages: 5761-5765
Page count: 5