Perceptual effects of noise reduction by time-frequency masking of noisy speech

Cited by: 21
Authors
Brons, Inge [1]
Houben, Rolph [1]
Dreschler, Wouter A. [1]
Affiliation
[1] Univ Amsterdam, Acad Med Ctr, NL-1105 AZ Amsterdam, Netherlands
Keywords
ENHANCEMENT ALGORITHMS; INTELLIGIBILITY
DOI
10.1121/1.4747006
Chinese Library Classification
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Time-frequency masking is a method for noise reduction that is based on the time-frequency representation of a speech-in-noise signal. Depending on the estimated signal-to-noise ratio (SNR), each time-frequency unit is either attenuated or not. A special type of time-frequency mask is the ideal binary mask (IBM), which has access to the real SNR (ideal). The IBM either retains or removes each time-frequency unit (binary mask). The IBM provides large improvements in speech intelligibility and is a valuable tool for investigating how different factors influence intelligibility. This study extends the standard outcome measure (speech intelligibility) with additional perceptual measures relevant for noise reduction: listening effort, noise annoyance, speech naturalness, and overall preference. Four types of time-frequency masking were evaluated: the original IBM, a tempered version of the IBM (called ITM) which applies limited and non-binary attenuation, and non-ideal masking (also tempered) with two different types of noise-estimation algorithms. The results from ideal masking imply that there is a trade-off between intelligibility and sound quality, which depends on the attenuation strength. Additionally, the results for non-ideal masking suggest that subjective measures can show effects of noise reduction even if noise reduction does not lead to differences in intelligibility. © 2012 Acoustical Society of America. [http://dx.doi.org/10.1121/1.4747006]
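The difference between binary and tempered masking described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 0 dB threshold, the Wiener-like gain rule, the 10 dB attenuation limit, and all function and variable names are assumptions chosen for illustration only. The abstract states only that the IBM retains or removes each unit based on the true SNR and that the tempered mask applies limited, non-binary attenuation.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log of zero


def ideal_binary_mask(speech_power, noise_power, threshold_db=0.0):
    """Keep (1.0) units whose true local SNR exceeds the threshold, else remove (0.0).

    The 0 dB default threshold is an assumption, not a value from the abstract.
    """
    snr_db = 10.0 * np.log10((speech_power + EPS) / (noise_power + EPS))
    return (snr_db > threshold_db).astype(float)


def tempered_mask(speech_power, noise_power, max_attenuation_db=10.0):
    """Hypothetical tempered mask: a non-binary gain with limited attenuation.

    Uses a Wiener-like amplitude gain snr / (1 + snr), then clips it from below
    so that no unit is attenuated by more than max_attenuation_db.
    """
    snr = (speech_power + EPS) / (noise_power + EPS)
    gain = snr / (1.0 + snr)
    floor = 10.0 ** (-max_attenuation_db / 20.0)
    return np.maximum(gain, floor)


# Toy 2x2 grid of time-frequency units with known speech and noise power.
speech_power = np.array([[4.0, 0.1], [1.0, 9.0]])
noise_power = np.ones((2, 2))

ibm = ideal_binary_mask(speech_power, noise_power)
itm = tempered_mask(speech_power, noise_power)
```

In this toy grid the binary mask removes the two units whose SNR does not exceed 0 dB, while the tempered mask only attenuates them, and never by more than the chosen limit. Multiplying either mask element-wise with the noisy signal's time-frequency amplitudes would give the processed representation.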
Pages: 2690-2699
Number of pages: 10