On the integration of time-frequency masking speech separation and recognition in underdetermined environments

Cited by: 0

Authors:

Jafari, Ingrid [1]

Haque, Serajul [1]

Togneri, Roberto [1]

Nordholm, Sven [2]

Affiliations:

[1] Univ Western Australia, Crawley, WA 6009, Australia

[2] Curtin Univ, Perth, WA 6845, Australia

Source:

2012 CONFERENCE RECORD OF THE FORTY SIXTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS (ASILOMAR) | 2012

Funding:

Australian Research Council;

Keywords:

SPARSE SOURCE SEPARATION; BLIND;

DOI:

Not available

Chinese Library Classification number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

The successful application of automatic speech recognition systems in the real world is conditional on their ability to handle realistic environments with unfavorable conditions such as reverberation and multiple sources of interference. Previous research has identified time-frequency masking approaches to blind source separation as viable for multisource reverberant source separation. It is proposed that using such separation techniques as a front-end to speech recognition will improve recognition accuracy. Experimental evaluations confirmed the hypothesis, with an improvement in recognition accuracy of over 20% at a reverberation time of RT60 = 300 ms; this indicates the potential for future research in this field.


Pages: 1613-1617

Page count: 5

50 items in total

[22] Independent Component Analysis and Time-Frequency Masking for Speech Recognition in Multitalker Conditions
Kolossa, Dorothea
Astudillo, Ramon Fernandez
Hoffmann, Eugen
Orglmeister, Reinhold
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2010,
[23] Label-Driven Time-Frequency Masking for Robust Speech Command Recognition
Soni, Meet
Sheikh, Imran
Kopparapu, Sunil Kumar
TEXT, SPEECH, AND DIALOGUE (TSD 2019), 2019, 11697 : 341 - 351
[24] Maximizing environmental sound recognition and speech intelligibility using time-frequency masking
Johnson, Eric M.
Healy, Eric W.
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2023, 153 (03):
[25] Underdetermined blind separation of overlapped speech mixtures in time-frequency domain with estimated number of sources
Zhang, Haijian
Hua, Guang
Yu, Lei
Cai, Yunlong
Bi, Guoan
SPEECH COMMUNICATION, 2017, 89 : 1 - 16
[26] Segmented Time-Frequency Masking Algorithm for Speech Separation Based on Deep Neural Networks
Guo, Xinyu
Ou, Shifeng
Gao, Meng
Gao, Ying
2020 13TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI 2020), 2020, : 445 - 450
[27] SPATIAL AND COHERENCE CUES BASED TIME-FREQUENCY MASKING FOR BINAURAL REVERBERANT SPEECH SEPARATION
Alinaghi, Atiyeh
Wang, Wenwu
Jackson, Philip J. B.
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 684 - 688
[28] Reverberant speech separation with probabilistic time-frequency masking for B-format recordings
Chen, Xiaoyi
Wang, Wenwu
Wang, Yingmin
Zhong, Xionghu
Alinaghi, Atiyeh
SPEECH COMMUNICATION, 2015, 68 : 41 - 54
[29] Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
Luo, Yi
Mesgarani, Nima
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (08) : 1256 - 1266
[30] ACOUSTIC VECTOR SENSOR BASED REVERBERANT SPEECH SEPARATION WITH PROBABILISTIC TIME-FREQUENCY MASKING
Zhong, Xionghu
Chen, Xiaoyi
Wang, Wenwu
Alinaghi, Atiyeh
Premkumar, A. B.
2013 PROCEEDINGS OF THE 21ST EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2013,
