Robust speech separation using time-frequency masking

Cited by: 0

Authors:

Aarabi, P [1]

Shi, GJ [1]

Jahromi, O [1]

Affiliations:

[1] Univ Toronto, Artificial Percept Lab, Toronto, ON M5S 3G4, Canada

Source:

2003 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I, PROCEEDINGS | 2003

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A multi-microphone time-frequency speech masking technique is proposed. The technique uses both the time-frequency magnitude and phase information to estimate the Signal-to-Noise Ratio (SNR) maximizing masking coefficients for each time-frequency block, given that the direction (or, alternatively, the time-delay of arrival) of the speaker of interest is known. With this masking algorithm, speech features (such as formants) from the direction of interest are preserved, while features from other directions are severely degraded. Digit recognition experiments indicate that the proposed technique can substantially increase the digit recognition accuracy rate. At 0 dB, for example, the proposed technique improves the digit recognition accuracy rate by 26% over the single-microphone case and by 12% over the two-microphone superdirective beamforming case.
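The abstract does not give the masking formula, so the following is only a minimal sketch of the general idea of phase-dependent time-frequency masking for two microphones, not the authors' SNR-maximizing estimator. It assumes the short-time spectra of both channels are already available, that the target's time-delay of arrival is known, and that a simple soft mask of the form 1 / (1 + gamma * error^2) is acceptable. The function name `phase_error_mask` and the sharpness parameter `gamma` are hypothetical.

```python
import numpy as np


def phase_error_mask(X1, X2, freqs_hz, tdoa_s, gamma=5.0):
    """Soft mask in (0, 1] from the inter-microphone phase error.

    X1, X2   : complex short-time spectra, shape (n_freqs, n_frames).
    freqs_hz : centre frequency of each bin, shape (n_freqs,).
    tdoa_s   : assumed known delay of microphone 2 relative to
               microphone 1 for the target speaker, in seconds.
    gamma    : hypothetical sharpness parameter (an assumption,
               not a value taken from the paper).
    """
    # Observed phase difference in each time-frequency block.
    observed = np.angle(X1 * np.conj(X2))
    # Phase difference expected for a source arriving with the known delay.
    expected = 2.0 * np.pi * freqs_hz[:, None] * tdoa_s
    # Wrap the phase error into (-pi, pi].
    error = np.angle(np.exp(1j * (observed - expected)))
    # Blocks consistent with the target delay get a weight near 1;
    # blocks with a large phase error are attenuated.
    return 1.0 / (1.0 + gamma * error ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    freqs = np.linspace(100.0, 4000.0, 64)
    tau = 2e-4  # illustrative target delay in seconds

    # Synthetic spectra: channel 2 is channel 1 delayed by tau.
    X1 = rng.standard_normal((64, 10)) + 1j * rng.standard_normal((64, 10))
    X2 = X1 * np.exp(-2j * np.pi * freqs[:, None] * tau)

    on_target = phase_error_mask(X1, X2, freqs, tau)
    off_target = phase_error_mask(X1, X2, freqs, -tau)
    print("mean mask, matching delay:   ", on_target.mean())
    print("mean mask, mismatched delay: ", off_target.mean())
```

In such a scheme the mask would be multiplied element-wise with one channel's spectrum before resynthesis or feature extraction, so that blocks dominated by sources from other directions contribute less to the recognizer's input.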


Pages: 741-744

Page count: 4
