Robust speech separation using time-frequency masking

Cited by: 0
Authors
Aarabi, P [1]
Shi, GJ [1]
Jahromi, O [1]
Affiliations
[1] Univ Toronto, Artificial Percept Lab, Toronto, ON M5S 3G4, Canada
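Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract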
A multi-microphone time-frequency speech masking technique is proposed. The technique uses both the time-frequency magnitude and phase information to estimate the signal-to-noise ratio (SNR) maximizing masking coefficients for each time-frequency block, given that the direction (or alternatively, the time delay of arrival) of the speaker of interest is known. With this masking algorithm, speech features (such as formants) from the direction of interest are preserved, while features from other directions are severely degraded. Digit recognition experiments indicate that the proposed technique can substantially increase digit recognition accuracy. At 0 dB, for example, the proposed technique improves the digit recognition accuracy rate by 26% over the single-microphone case and by 12% over the two-microphone superdirective beamforming case.
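The abstract does not give the masking formula, so the following is only a minimal sketch of the general idea: a two-microphone soft mask driven by the inter-channel phase error relative to a known time delay of arrival. The mask shape `1 / (1 + gamma * theta**2)`, the parameter `gamma`, and the function names `phase_error_mask` and `separate` are assumptions made for illustration. The sketch is not claimed to be the SNR-maximizing mask described above, and it uses phase only, whereas the proposed technique also uses magnitude.

```python
import numpy as np

def phase_error_mask(X1, X2, freqs_hz, tdoa_s, gamma=5.0):
    """Soft mask in (0, 1] for each time-frequency bin (illustrative only).

    X1, X2   : complex STFTs of the two microphones, shape (n_freqs, n_frames)
    freqs_hz : frequency of each STFT row, in Hz
    tdoa_s   : known delay of the target at mic 2 relative to mic 1, in seconds
    gamma    : hypothetical sharpness parameter (larger = more selective)
    """
    # For a target with x2(t) = x1(t - tdoa), X1 * conj(X2) has phase
    # +2*pi*f*tdoa. Multiplying by exp(-j*2*pi*f*tdoa) removes that phase,
    # so what remains is the phase error with respect to the target direction.
    compensation = np.exp(-1j * 2 * np.pi * freqs_hz * tdoa_s)[:, None]
    theta = np.angle(X1 * np.conj(X2) * compensation)  # wrapped to (-pi, pi]
    # Assumed mask shape: 1 at zero phase error, decaying as the error grows.
    return 1.0 / (1.0 + gamma * theta ** 2)

def separate(X1, X2, freqs_hz, tdoa_s, gamma=5.0):
    """Apply the mask to the delay-aligned average of the two channels."""
    mask = phase_error_mask(X1, X2, freqs_hz, tdoa_s, gamma)
    # Undo the target delay on channel 2 so both channels add coherently.
    steer = np.exp(1j * 2 * np.pi * freqs_hz * tdoa_s)[:, None]
    aligned = 0.5 * (X1 + X2 * steer)
    return mask * aligned

# Synthetic check: one source arriving with the assumed delay, one without.
rng = np.random.default_rng(0)
freqs = np.linspace(0.0, 8000.0, 257)
S = rng.standard_normal((257, 10)) + 1j * rng.standard_normal((257, 10))
tau_target, tau_other = 2e-4, -2e-4

def delayed(X, tau):
    """Second-microphone STFT for a source delayed by tau seconds."""
    return X * np.exp(-1j * 2 * np.pi * freqs * tau)[:, None]

mask_target = phase_error_mask(S, delayed(S, tau_target), freqs, tau_target)
mask_other = phase_error_mask(S, delayed(S, tau_other), freqs, tau_target)
```

In this synthetic setup, bins dominated by the source at the assumed delay keep a mask value near 1, while bins from the other delay are attenuated at most frequencies. A real system would compute `X1` and `X2` with an STFT of the microphone signals and invert the masked result back to the time domain.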
Pages: 741-744
Page count: 4
Related papers
50 in total
  • [21] Localization based stereo speech source separation using probabilistic time-frequency masking and deep neural networks
    Yang Yu
    Wenwu Wang
    Peng Han
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2016
  • [22] Localization based stereo speech source separation using probabilistic time-frequency masking and deep neural networks
    Yu, Yang
    Wang, Wenwu
    Han, Peng
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2016
  • [23] Maximizing environmental sound recognition and speech intelligibility using time-frequency masking
    Johnson, Eric M.
    Healy, Eric W.
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2023, 153 (03)
  • [24] Impact of phase estimation on single-channel speech separation based on time-frequency masking
    Mayer, Florian
    Williamson, Donald S.
    Mowlaee, Pejman
    Wang, DeLiang
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2017, 141 (06): 4668 - 4679
  • [25] Constructing Time-Frequency Dictionaries for Source Separation via Time-Frequency Masking and Source Localisation
    de Frein, Ruairi
    Rickard, Scott T.
    Pearlmutter, Barak A.
    [J]. INDEPENDENT COMPONENT ANALYSIS AND SIGNAL SEPARATION, PROCEEDINGS, 2009, 5441 : 573 - +
  • [26] Robust digit recognition using phase-dependent time-frequency masking
    Shi, GJ
    Aarabi, P
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 684 - 687
  • [27] Robust digit recognition using phase-dependent time-frequency masking
    Shi, GJ
    Aarabi, P
    [J]. 2003 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL III, PROCEEDINGS, 2003, : 629 - 632
  • [28] An Assessment of the Improvement Potential of Time-Frequency Masking for Speech Dereverberation
    Zheng, Chenxi
    Falk, Tiago H.
    Chan, Wai-Yip
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 212 - +
  • [29] Time-Frequency Masking in the Complex Domain for Speech Dereverberation and Denoising
    Williamson, Donald S.
    Wang, DeLiang
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (07) : 1492 - 1501
  • [30] Cepstral representation of speech motivated by time-frequency masking: An application to speech recognition
    Aikawa, K
    Singer, H
    Kawahara, H
    Tohkura, Y
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1996, 100 (01): 603 - 614