Robust speech separation using time-frequency masking

被引:0
|
作者
Aarabi, P [1 ]
Shi, GJ [1 ]
Jahromi, O [1 ]
机构
[1] Univ Toronto, Artificial Percept Lab, Toronto, ON M5S 3G4, Canada
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
A multi-microphone time-frequency speech masking technique is proposed. This technique utilizes both the time frequency magnitude and phase information in order to estimate the Signal-to-Noise Ratio (SNR) maximizing masking coefficients for each time-frequency block given that the direction (or alternatively, the time-delay of arrival) of the speaker of interest is known. Using this masking algorithm, speech features (such as formants) from the direction of interest are preserved while features from other directions are severely degraded. Digit recognition experiments indicate that the proposed technique can result in a substantial increase in the digit recognition accuracy rate. At 0dB, for example, the proposed technique results in a digit recognition accuracy rate improvement of 26% over the single microphone case and an improvement of 12% over the two microphone superdirective beamforming case.
引用
收藏
页码:741 / 744
页数:4
相关论文
共 50 条
  • [1] Time-Frequency Masking For Large Scale Robust Speech Recognition
    Wang, Yuxuan
    Misra, Ananya
    Chine, Kean K.
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2469 - 2473
  • [2] Blind separation of speech mixtures via time-frequency masking
    Yilmaz, Ö
    Rickard, S
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2004, 52 (07) : 1830 - 1847
  • [3] Separation and robust recognition of noisy, convolutive speech mixtures using time-frequency masking and missing data techniques
    Kolossa, D
    Klimas, A
    Orglmeister, R
    [J]. 2005 WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2005, : 82 - 85
  • [4] Robust Automatic Speech Recognition System Based on Using Adaptive Time-Frequency Masking
    Gouda, Ahmed Mostafa
    Tamazin, Mohamed
    Khedr, Mohamed
    [J]. PROCEEDINGS OF 2016 11TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING & SYSTEMS (ICCES), 2016, : 181 - 186
  • [5] On time-frequency masking in voiced speech
    Skoglund, J
    Kleijn, WB
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2000, 8 (04): : 361 - 369
  • [6] Label Driven Time-Frequency Masking for Robust Continuous Speech Recognition
    Soni, Meet
    Panda, Ashish
    [J]. INTERSPEECH 2019, 2019, : 426 - 430
  • [7] On the integration of time-frequency masking speech separation and recognition in underdetermined environments
    Jafari, Ingrid
    Haque, Serajul
    Togneri, Roberto
    Nordholm, Sven
    [J]. 2012 CONFERENCE RECORD OF THE FORTY SIXTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS (ASILOMAR), 2012, : 1613 - 1617
  • [8] Blind source separation using time-frequency masking
    Mohammed, Abbas
    Ballal, Tarig
    Grbic, Nedelko
    [J]. RADIOENGINEERING, 2007, 16 (04) : 96 - 100
  • [9] Blind speech source separation via nonlinear time-frequency masking
    XU Shun CHEN Shaorong LIU Yulin (DSP Lab.
    [J]. Chinese Journal of Acoustics, 2008, (03) : 203 - 214
  • [10] Online blind speech separation using multiple acoustic speaker tracking and time-frequency masking
    Pertila, P.
    [J]. COMPUTER SPEECH AND LANGUAGE, 2013, 27 (03): : 683 - 702