Sixty Years of Frequency-Domain Monaural Speech Enhancement: From Traditional to Deep Learning Methods

Cited by: 26
Authors
Zheng, Chengshi [1, 2, 4, 5]
Zhang, Huiyong [1, 2]
Liu, Wenzhe [1, 2]
Luo, Xiaoxue [1, 2]
Li, Andong [1, 2]
Li, Xiaodong [1, 2]
Moore, Brian C. J. [3]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, Key Lab Noise & Vibrat Res, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Univ Cambridge, Dept Psychol, Cambridge Hearing Grp, Cambridge, England
[4] Chinese Acad Sci, Inst Acoust, Key Lab Noise & Vibrat Res, Beijing 100190, Peoples R China
[5] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Source
TRENDS IN HEARING | 2023 / Volume 27
Keywords
speech enhancement; speech dereverberation; multistage learning; noise estimation; deep complex network; GENERALIZED SPECTRAL SUBTRACTION; NOISE-REDUCTION ALGORITHM; RECURRENT NEURAL-NETWORKS; SQUARE ERROR ESTIMATION; HEARING-AID DELAYS; STATISTICAL-MODEL; SOURCE SEPARATION; MMSE ESTIMATOR; MUSICAL NOISE; PHASE;
DOI
10.1177/23312165231209913
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
Frequency-domain monaural speech enhancement has been extensively studied for over 60 years, and a great number of methods have been proposed and applied to many devices. In the last decade, monaural speech enhancement has made tremendous progress with the advent and development of deep learning, and performance using such methods has been greatly improved relative to traditional methods. This survey paper first provides a comprehensive overview of traditional and deep-learning methods for monaural speech enhancement in the frequency domain. The fundamental assumptions of each approach are then summarized and analyzed to clarify their limitations and advantages. A comprehensive evaluation of some typical methods was conducted using the WSJ + Deep Noise Suppression (DNS) challenge and Voice Bank + DEMAND datasets to give an intuitive and unified comparison. The benefits of monaural speech enhancement methods using objective metrics relevant for normal-hearing and hearing-impaired listeners were evaluated. The objective test results showed that compression of the input features was important for simulated normal-hearing listeners but not for simulated hearing-impaired listeners. Potential future research and development topics in monaural speech enhancement are suggested.
Pages: 52
Related papers
50 in total
  • [21] Adaptive β-order perceptually motivated speech enhancement algorithm based on frequency-domain auditory masking
    Wang, Yue
    Li, Ping
    Cui, Jie
    Shengxue Xuebao/Acta Acustica, 2013, 38 (04): : 501 - 508
  • [22] Two-Stage Learning and Fusion Network With Noise Aware for Time-Domain Monaural Speech Enhancement
    Xiang, Xiaoxiao
    Zhang, Xiaojuan
    Chen, Haozhe
    IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 1754 - 1758
  • [23] Fast Frequency-Domain Analysis for Parametric Electromagnetic Models Using Deep Learning
    Mattucci, Elia
    Feng, Lihong
    Benner, Peter
    Romano, Daniele
    Antonini, Giulio
    2023 IEEE 32ND CONFERENCE ON ELECTRICAL PERFORMANCE OF ELECTRONIC PACKAGING AND SYSTEMS, EPEPS, 2023,
  • [24] Application research on vector coherent frequency-domain batch adaptive line enhancement in deep water
    Li, He
    Wang, Tong
    Guo, Xinyi
    Su, Lin
    Mo, Yaxiao
    IET RADAR SONAR AND NAVIGATION, 2024, 18 (10): : 1859 - 1873
  • [25] An efficient frequency-domain adaptive forward BSS algorithm for acoustic noise reduction and speech quality enhancement
    Djendi, Mohamed
    COMPUTERS & ELECTRICAL ENGINEERING, 2016, 52 : 12 - 27
  • [26] Accelerating 2D and 3D frequency-domain seismic wave modeling through interpolating frequency-domain wavefields by deep learning
    Cao, Wenzhong
    Li, Quanli
    Zhang, Jie
    Zhang, Wei
    GEOPHYSICS, 2022, 87 (04) : T315 - T328
  • [27] Universal Approximation Theorem and Deep Learning for the Solution of Frequency-Domain Electromagnetic Scattering Problems
    Wang, Ji-Yuan
    Pan, Xiao-Min
    IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, 2024, 72 (12) : 9274 - 9285
  • [28] Deep Learning Assisted Time-Frequency Processing for Speech Enhancement on Drones
    Wang, Lin
    Cavallaro, Andrea
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2021, 5 (06): : 871 - 881
  • [29] DC series arc diagnosis based on deep-learning algorithm with frequency-domain characteristics
    Jae-Yoon Jeong
    Jae-Chang Kim
    Sangshin Kwak
    Journal of Power Electronics, 2021, 21 : 1900 - 1909
  • [30] DC series arc diagnosis based on deep-learning algorithm with frequency-domain characteristics
    Jeong, Jae-Yoon
    Kim, Jae-Chang
    Kwak, Sangshin
    JOURNAL OF POWER ELECTRONICS, 2021, 21 (12) : 1900 - 1909