On phase recovery and preserving early reflections for deep-learning speech dereverberation

Cited by: 1
Authors
Luo, Xiaoxue [1, 2]
Ke, Yuxuan [1, 2]
Li, Xiaodong [1, 2]
Zheng, Chengshi [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, Key Lab Noise & Vibrat Res, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
SEPARATION; NETWORKS; INTELLIGIBILITY; REVERBERATION; ALGORITHM; MASKING;
DOI
10.1121/10.0024348
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
In indoor environments, reverberation often distorts clean speech. Although deep learning-based speech dereverberation approaches have shown much better performance than traditional ones, the inferior quality of the dereverberated speech, caused by magnitude distortion and limited phase recovery, remains a serious problem for practical applications. This paper improves the performance of deep learning-based speech dereverberation from the perspectives of both network design and mapping target optimization. On the one hand, a bifurcated-and-fusion network and its guidance loss functions were designed to reduce the magnitude distortion while enhancing the phase recovery. On the other hand, the time boundary between the early and late reflections in the mapped speech was investigated, so as to strike a balance between the reverberation tailing effect and the difficulty of magnitude/phase recovery. Mathematical derivations were provided to justify the specially designed loss functions. Geometric illustrations were given to explain the importance of preserving early reflections in reducing the difficulty of phase recovery. Ablation study results confirmed the validity of the proposed network topology and the importance of preserving 20 ms of early reflections in the mapped speech. Objective and subjective test results showed that the proposed system outperformed other baselines in the speech dereverberation task.
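The idea of keeping 20 ms of early reflections in the mapping target can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `early_reflection_target`, the 16 kHz sampling rate, and the choice of locating the direct path at the largest-magnitude tap of the room impulse response (RIR) are all assumptions made for illustration.

```python
import numpy as np

def early_reflection_target(clean, rir, fs=16000, early_ms=20.0):
    """Return (reverberant, target) signals, both trimmed to len(clean).

    Hypothetical helper: the target is the clean speech convolved with the
    RIR truncated `early_ms` milliseconds after the direct path.
    """
    clean = np.asarray(clean, dtype=float)
    rir = np.asarray(rir, dtype=float)

    # Assumption: the direct path is the strongest tap of the RIR.
    direct = int(np.argmax(np.abs(rir)))
    n_early = int(round(early_ms * 1e-3 * fs))

    # Keep everything up to the boundary, zero the late reflections.
    early_rir = rir.copy()
    early_rir[direct + n_early:] = 0.0

    reverberant = np.convolve(clean, rir)[: len(clean)]
    target = np.convolve(clean, early_rir)[: len(clean)]
    return reverberant, target

# Toy RIR: direct path at tap 10, one early and one late reflection.
rir = np.zeros(4000)
rir[10], rir[100], rir[1000] = 1.0, 0.5, 0.3
clean = np.zeros(2000)
clean[0] = 1.0  # unit impulse stands in for speech
reverberant, target = early_reflection_target(clean, rir)
```

With the toy RIR, the reflection at tap 100 lies inside the 20 ms window (320 samples at 16 kHz) and stays in the target, while the one at tap 1000 is removed from the target but remains in the reverberant input.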
Pages: 436-451
Number of pages: 16