A Two-Stage Approach to Noisy Cochannel Speech Separation with Gated Residual Networks

Cited by: 2

Authors:

Tan, Ke [1]

Wang, DeLiang [1,2]

Affiliations:

[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA

[2] Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH 43210 USA

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Keywords:

noisy cochannel speech separation; gated residual networks; ideal ratio mask; denoising; cochannel separation;

DOI:

10.21437/Interspeech.2018-1406

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Cochannel speech separation is the task of separating two speech signals from a single mixture. The task becomes even more challenging if the speech mixture is further corrupted by background noise. In this study, we focus on a gender-dependent scenario, where the target speech is from a male speaker and the interfering speech is from a female speaker. We propose a two-stage separation strategy to address this problem in a noise-independent way. In the proposed system, denoising and cochannel separation are performed successively by two modules, which are based on a newly introduced convolutional neural network for speech separation. The evaluation results demonstrate that the proposed system substantially outperforms one-stage baselines in terms of objective intelligibility and perceptual quality.
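The two-stage idea in the abstract (denoise first, then separate the two talkers) can be illustrated with oracle ideal ratio masks. This is only a minimal sketch under stated assumptions, not the authors' system: the abstract does not give a mask formula, so the widely used form sqrt(S^2 / (S^2 + N^2)) is assumed; the masks are computed from known sources rather than estimated by a network; and the function names, the spectrogram shape, and the random magnitudes are all hypothetical.

```python
import numpy as np

def ideal_ratio_mask(target_mag, interference_mag, eps=1e-8):
    """Assumed IRM form per time-frequency unit: sqrt(S^2 / (S^2 + N^2))."""
    t2 = target_mag ** 2
    i2 = interference_mag ** 2
    return np.sqrt(t2 / (t2 + i2 + eps))

def two_stage_oracle_masks(male_mag, female_mag, noise_mag):
    """Return (denoising mask, cochannel mask) computed from known sources."""
    # Stage 1 (denoising): keep both talkers, suppress background noise.
    speech_mag = np.sqrt(male_mag ** 2 + female_mag ** 2)
    denoise_mask = ideal_ratio_mask(speech_mag, noise_mag)
    # Stage 2 (cochannel separation): keep the male target,
    # suppress the female interferer.
    cochannel_mask = ideal_ratio_mask(male_mag, female_mag)
    return denoise_mask, cochannel_mask

# Hypothetical magnitude spectrograms: (frequency bins, frames).
rng = np.random.default_rng(0)
shape = (161, 100)
male, female, noise = (rng.random(shape) for _ in range(3))

m1, m2 = two_stage_oracle_masks(male, female, noise)

# Assumes the three sources are uncorrelated, so their powers add.
mixture_mag = np.sqrt(male ** 2 + female ** 2 + noise ** 2)

# Apply the stages successively: denoise, then extract the target talker.
target_estimate = m2 * (m1 * mixture_mag)
```

Under the power-additivity assumption, the oracle masks recover the male magnitude almost exactly. In a learned system, each stage would instead predict its mask from the noisy input, so the result would be only an approximation.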


Pages: 3484-3488

Page count: 5
