On phase recovery and preserving early reflections for deep-learning speech dereverberation

Cited by: 1
Authors
Luo, Xiaoxue [1, 2]
Ke, Yuxuan [1, 2]
Li, Xiaodong [1, 2]
Zheng, Chengshi [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, Key Lab Noise & Vibrat Res, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Source
Funding
National Natural Science Foundation of China
Keywords
SEPARATION; NETWORKS; INTELLIGIBILITY; REVERBERATION; ALGORITHM; MASKING;
DOI
10.1121/10.0024348
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In indoor environments, reverberation often distorts clean speech. Although deep learning-based speech dereverberation approaches perform much better than traditional ones, the degraded quality of the dereverberated speech, caused by magnitude distortion and limited phase recovery, remains a serious problem for practical applications. This paper improves deep learning-based speech dereverberation through both network design and mapping target optimization. First, a bifurcated-and-fusion network and its guidance loss functions were designed to reduce magnitude distortion while enhancing phase recovery. Second, the time boundary between the early and late reflections in the mapped speech was investigated to strike a balance between the reverberation tailing effect and the difficulty of magnitude/phase recovery. Mathematical derivations were provided to justify the specially designed loss functions. Geometric illustrations were given to explain why preserving early reflections reduces the difficulty of phase recovery. Ablation study results confirmed the validity of the proposed network topology and the importance of preserving 20 ms of early reflections in the mapped speech. Objective and subjective test results showed that the proposed system outperformed the baselines in the speech dereverberation task.
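The idea of a mapping target that keeps the early reflections can be illustrated with a minimal sketch. The abstract does not say how the target is constructed, so everything here is an assumption: the function name `early_reflection_target` is hypothetical, the direct path is taken to be the strongest tap of the room impulse response (RIR), and the target is formed by zeroing the RIR beyond 20 ms after that tap and convolving it with the clean speech.

```python
import numpy as np


def early_reflection_target(clean, rir, fs, early_ms=20.0):
    """Return clean speech convolved with the early part of the RIR.

    Assumptions (not taken from the paper): the direct path is the
    largest-magnitude RIR tap, and "early" means the first `early_ms`
    milliseconds after it.
    """
    clean = np.asarray(clean, dtype=float)
    early_rir = np.array(rir, dtype=float)  # copy so the input is untouched

    # Locate the direct path and the early/late time boundary in samples.
    direct = int(np.argmax(np.abs(early_rir)))
    boundary = direct + int(round(early_ms * 1e-3 * fs))

    # Discard the late reflections; keep the initial delay for alignment
    # with the reverberant input.
    early_rir[boundary:] = 0.0

    # Truncate to the clean-speech length so target and input line up.
    return np.convolve(clean, early_rir)[: len(clean)]


if __name__ == "__main__":
    fs = 16000
    rir = np.zeros(1000)
    rir[10] = 1.0    # direct path
    rir[170] = 0.5   # reflection 10 ms after the direct path (kept)
    rir[810] = 0.3   # reflection 50 ms after the direct path (removed)
    clean = np.zeros(2000)
    clean[0] = 1.0   # unit impulse standing in for speech
    target = early_reflection_target(clean, rir, fs)
    print(target[10], target[170], target[810])
```

Varying `early_ms` in such a sketch moves the boundary between what the network is asked to keep and what it is asked to remove, which is the trade-off the abstract describes.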
Pages: 436-451
Page count: 16
Related papers
50 in total
  • [1] Deep-Learning Framework for Efficient Real-Time Speech Enhancement and Dereverberation
    Rosenbaum, Tomer
    Winebrand, Emil
    Cohen, Omer
    Cohen, Israel
    SENSORS, 2025, 25 (03)
  • [2] Robust Speech Dereverberation Based on WPE and Deep Learning
    Li, Hao
    Zhang, Xueliang
    Gao, Guanglai
    2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020 : 52 - 56
  • [3] Deep Learning Based Target Cancellation for Speech Dereverberation
    Wang, Zhong-Qiu
    Wang, DeLiang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 941 - 950
  • [4] SPEECH DEREVERBERATION BASED ON INTEGRATED DEEP AND ENSEMBLE LEARNING ALGORITHM
    Lee, Wei-Jen
    Wang, Syu-Siang
    Chen, Fei
    Lu, Xugang
    Chien, Shao-Yi
    Tsao, Yu
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018 : 5454 - 5458
  • [5] Deep Learning-Based Amplitude Fusion for Speech Dereverberation
    Liu, Chunlei
    Wang, Longbiao
    Dang, Jianwu
    DISCRETE DYNAMICS IN NATURE AND SOCIETY, 2020, 2020
  • [6] An Integrated Deep Learning Model for Concurrent Speech Dereverberation and Denoising
    Mane, Vijay M.
    Arote, Seema S.
    Shaikh, Shakil A.
    JOURNAL OF ADVANCES IN INFORMATION TECHNOLOGY, 2024, 15 (02) : 281 - 287
  • [7] Deep Learning Based Dereverberation of Temporal Envelopes for Robust Speech Recognition
    Purushothaman, Anurenjan
    Sreeram, Anirudh
    Kumar, Rohit
    Ganapathy, Sriram
    INTERSPEECH 2020, 2020 : 1688 - 1692
  • [8] Classifying aircraft based on sparse recovery and deep-learning
    Wang Wenying
    Wei Yao
    Zhen Xuanxuan
    Yu Hui
    Wang Ruqi
    JOURNAL OF ENGINEERING-JOE, 2019, 2019 (21) : 7464 - 7468
  • [9] Classifying aircraft based on sparse recovery and deep-learning
    Wenying, Wang
    Yao, Wei
    Xuanxuan, Zhen
    Hui, Yu
    Ruqi, Wang
    Journal of Engineering, 2019, 2019 (21) : 7464 - 7468
  • [10] An End-to-End Deep Learning Approach to Simultaneous Speech Dereverberation and Acoustic Modeling for Robust Speech Recognition
    Wu, Bo
    Li, Kehuang
    Ge, Fengpei
    Huang, Zhen
    Yang, Minglei
    Siniscalchi, Sabato Marco
    Lee, Chin-Hui
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2017, 11 (08) : 1289 - 1300