IMPORTANCE OF SWITCH OPTIMIZATION CRITERION IN SWITCHING WPE DEREVERBERATION

Citations: 1
Authors
Kamo, Naoyuki [1]
Ikeshita, Rintaro [1]
Kinoshita, Keisuke [1]
Nakatani, Tomohiro [1]
Affiliations
[1] NTT Corp, Tokyo, Japan
Keywords
Dereverberation; linear prediction (LP); weighted prediction error (WPE); speech recognition; SPEECH DEREVERBERATION; REVERBERATION; EQUALIZATION; MICROPHONE; MIXTURE
DOI
10.1109/ICASSP43922.2022.9746904
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Weighted prediction error (WPE) is a fundamental dereverberation method that predicts the late reverberation component of an observed signal using linear prediction (LP). Recently, WPE was extended to Switching WPE (SwWPE), which optimizes (i) multiple LP filters and (ii) switching parameters that determine which LP filter is used for each time-frequency bin. Conventionally, these parameters are optimized under the maximum likelihood (ML) criterion, but this is not optimal in terms of signal-quality measures such as the signal-to-distortion ratio (SDR) and the word error rate (WER) of automatic speech recognition. We thus propose a new SwWPE processing flow that allows the switching parameters to be optimized under an arbitrary criterion. Using oracle clean signals, we demonstrate the potential performance of the new approach with an SDR maximization criterion, revealing that it can significantly improve on the SDR and WER obtained by the conventional ML-based SwWPE. This motivates us to propose a new SwWPE processing scheme in which the switching parameters are estimated externally by a deep neural network (DNN) trained with an end-to-end SDR maximization criterion. The experimental results clearly demonstrate the improved SDR performance of the new approach compared with the conventional WPE and SwWPE.
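The idea of choosing switching parameters with oracle clean signals can be illustrated with a small sketch. This is not the authors' algorithm: it assumes the candidate LP filters are already fixed and have each produced a dereverberated output, whereas the abstract describes filters and switches being optimized together. The function names (`sdr_db`, `oracle_switch`, `apply_switch`) and the plain energy-ratio SDR definition are assumptions made for illustration only. Under those assumptions, picking the closest candidate in every time-frequency bin minimizes the total distortion and therefore maximizes this simple SDR.

```python
import math


def sdr_db(reference, estimate):
    """Plain SDR in dB: 10*log10(sum|s|^2 / sum|s - s_hat|^2) (assumed definition)."""
    signal = sum(abs(r) ** 2 for r in reference)
    distortion = sum(abs(r - e) ** 2 for r, e in zip(reference, estimate))
    if distortion == 0.0:
        return math.inf
    return 10.0 * math.log10(signal / distortion)


def oracle_switch(candidates, clean):
    """Return, per bin, the index of the candidate output closest to the clean value.

    candidates[k][i] is the output of hypothetical LP filter k at flattened
    time-frequency bin i; clean[i] is the oracle clean value at that bin.
    """
    switches = []
    for i, target in enumerate(clean):
        errors = [abs(target - cand[i]) ** 2 for cand in candidates]
        switches.append(errors.index(min(errors)))
    return switches


def apply_switch(candidates, switches):
    """Assemble the switched output by taking candidate switches[i] at bin i."""
    return [candidates[k][i] for i, k in enumerate(switches)]


# Toy complex-valued "spectra" with four bins and two candidate filter outputs.
clean = [1 + 0j, 0 + 1j, -1 + 0j, 0.5 + 0.5j]
cand0 = [0.9 + 0j, 0.2 + 0.5j, -0.4 + 0j, 0.5 + 0.1j]
cand1 = [0.5 + 0j, 0 + 0.9j, -0.9 + 0j, 0.5 + 0.4j]

switches = oracle_switch([cand0, cand1], clean)
switched = apply_switch([cand0, cand1], switches)
print(switches)
print(sdr_db(clean, cand0), sdr_db(clean, cand1), sdr_db(clean, switched))
```

In this toy case the switched output cannot score a lower SDR than either single candidate, since every bin takes the better of the two. In practice no clean signal is available, which is why the abstract moves from the oracle experiment to a DNN that estimates the switches.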
Pages: 176-180
Page count: 5