Deep Learning-Based Amplitude Fusion for Speech Dereverberation

Cited: 1
Authors
Liu, Chunlei [1 ,2 ]
Wang, Longbiao [2 ]
Dang, Jianwu [2 ,3 ]
Affiliations
[1] Dezhou Univ, Sch Comp & Informat, Dezhou 253023, Peoples R China
[2] Tianjin Univ, Coll Intelligence & Comp, Tianjin Key Lab Cognit Comp & Applicat, Tianjin 300350, Peoples R China
[3] Japan Adv Inst Sci & Technol, Nomi, Ishikawa 9231292, Japan
Funding
National Natural Science Foundation of China
Keywords
MASKING; SEGREGATION; SEPARATION; FEATURES; NOISE;
DOI
10.1155/2020/4618317
Chinese Library Classification (CLC)
O1 [Mathematics]
Discipline Classification Code
0701; 070101
Abstract
Mapping and masking are two important deep-learning-based speech enhancement methods that aim to recover the original clean speech from corrupted speech. In practice, excessively large recovery errors severely limit the achievable improvement in speech quality. In a preliminary experiment, we demonstrated that the mapping and masking methods rely on different conversion mechanisms and therefore hypothesized that their recovery errors are likely to be complementary; this complementarity was then validated experimentally. Based on the principle of error minimization, we propose fusing mapping and masking for speech dereverberation. Specifically, we take the weighted mean of the amplitudes recovered by the two methods as the estimated amplitude of the fusion method. Experiments verify that the recovery error of the fusion method is further reduced. Compared with the existing geometric mean method, the proposed weighted mean method achieves better results. Speech dereverberation experiments show that the weighted mean method improves PESQ and SNR by 5.8% and 25.0%, respectively, compared with the traditional masking method.
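The core operation described in the abstract, taking a weighted mean of the amplitude spectra estimated by the mapping and masking branches, can be sketched as below. This is a minimal illustration only: the fusion weight alpha, the function names, and the reuse of the reverberant phase for resynthesis are assumptions made for the example and are not taken from the paper.

    # Minimal sketch of weighted-mean amplitude fusion (illustrative assumptions noted above).
    import numpy as np

    def fuse_amplitudes(mag_mapping, mag_masking, alpha=0.5):
        """Weighted mean of the magnitude spectra estimated by the mapping
        and masking methods; alpha = 0.5 reduces to the arithmetic mean."""
        return alpha * mag_mapping + (1.0 - alpha) * mag_masking

    def reconstruct_stft(mag_fused, stft_reverberant):
        """Pair the fused magnitude with the reverberant phase (a common choice
        when only the amplitude is enhanced) to obtain a complex STFT that can
        then be inverted back to a time-domain waveform."""
        phase = np.angle(stft_reverberant)
        return mag_fused * np.exp(1j * phase)

In this sketch both inputs are magnitude spectrograms of shape (frequency, frames); setting alpha closer to 0 or 1 shifts the fusion toward the masking or mapping estimate, respectively.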
Pages: 14
Related Papers
50 records in total
  • [1] Robust Speech Dereverberation Based on WPE and Deep Learning
    Li, Hao
    Zhang, Xueliang
    Gao, Guanglai
    [J]. 2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 52 - 56
  • [2] Deep Learning Based Target Cancellation for Speech Dereverberation
    Wang, Zhong-Qiu
    Wang, DeLiang
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 941 - 950
  • [3] SPEECH DEREVERBERATION BASED ON INTEGRATED DEEP AND ENSEMBLE LEARNING ALGORITHM
    Lee, Wei-Jen
    Wang, Syu-Siang
    Chen, Fei
    Lu, Xugang
    Chien, Shao-Yi
    Tsao, Yu
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5454 - 5458
  • [4] Deep Learning-Based Dereverberation for Sound Source Localization with Beamforming
    Zhai, Qingbo
    Ning, Fangli
    Hou, Hongjie
    Wei, Juan
    Su, Zhaojing
    [J]. JOURNAL OF THEORETICAL AND COMPUTATIONAL ACOUSTICS, 2024, 32 (01):
  • [5] Deep Learning Based Dereverberation of Temporal Envelopes for Robust Speech Recognition
    Purushothaman, Anurenjan
    Sreeram, Anirudh
    Kumar, Rohit
    Ganapathy, Sriram
    [J]. INTERSPEECH 2020, 2020, : 1688 - 1692
  • [6] On the Robustness of Deep Learning-Based Speech Enhancement
    Chhetri, Amit S.
    Hilmes, Philip
    Athi, Mrudula
    Shankar, Nikhil
    [J]. 2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, : 1587 - 1594
  • [7] Target exaggeration for deep learning-based speech enhancement
    Kim, Hansol
    Shin, Jong Won
    [J]. DIGITAL SIGNAL PROCESSING, 2021, 116
  • [8] Deep Learning-based Telephony Speech Recognition in the Wild
    Han, Kyu J.
    Hahm, Seongjun
    Kim, Byung-Hak
    Kim, Jungsuk
    Lane, Ian
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1323 - 1327
  • [9] Learning Feature Fusion in Deep Learning-Based Object Detector
    Hassan, Ehtesham
    Khalil, Yasser
    Ahmad, Imtiaz
    [J]. JOURNAL OF ENGINEERING, 2020, 2020
  • [10] LDCCRN: Robust Deep Learning-based Speech Enhancement
    Yeung, Chun-Yin
    Mung, Steve W. Y.
    Choy, Yat Sze
    Lun, Daniel P. K.
    [J]. INTERNATIONAL WORKSHOP ON ADVANCED IMAGING TECHNOLOGY (IWAIT) 2022, 2022, 12177