DEEP CASA FOR TALKER-INDEPENDENT MONAURAL SPEECH SEPARATION

Cited by: 0
Authors
Liu, Yuzhou [1 ]
Delfarah, Masood [1 ]
Wang, DeLiang [1 ,2 ]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH USA
Keywords
Monaural speech separation; speech enhancement; speaker separation; deep CASA;
DOI
10.1109/icassp40776.2020.9054572
Chinese Library Classification
O42 [Acoustics];
Discipline Codes
070206 ; 082403 ;
Abstract
Monaural speech separation is the task of separating target speech from interference in single-channel recordings. Although substantial progress has been made recently in deep learning based speech separation, previous studies usually focus on a single type of interference, either background noise or competing speakers. In this study, we address both speech and nonspeech interference, i.e., monaural speaker separation in noise, in a talker-independent fashion. We extend a recently proposed deep CASA system to deal with noisy speaker mixtures. To facilitate speech enhancement, a denoising module is added to deep CASA as a front-end processor. The proposed systems achieve state-of-the-art results on a benchmark noisy two-speaker separation dataset. The denoising module leads to substantial performance gain across various noise types, and even better generalization in noise-free conditions.
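The abstract describes a two-stage architecture: a denoising module acting as a front-end processor, whose enhanced output is then passed to the deep CASA separator. The following is a minimal sketch of that front-end/back-end ordering, assuming a PyTorch-style implementation; the module names (DenoisingFrontEnd, Separator), layer sizes, and the 16 kHz one-second input are illustrative assumptions and not the authors' released code.

    # Minimal sketch (illustrative, not the authors' implementation) of the
    # pipeline described in the abstract: denoising front-end -> speaker separation.
    import torch
    import torch.nn as nn

    class DenoisingFrontEnd(nn.Module):
        """Hypothetical front-end that maps a noisy mixture to an enhanced mixture."""
        def __init__(self, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv1d(1, hidden, kernel_size=16, stride=8, padding=4),
                nn.ReLU(),
                nn.ConvTranspose1d(hidden, 1, kernel_size=16, stride=8, padding=4),
            )

        def forward(self, noisy_mix):          # (batch, 1, samples)
            return self.net(noisy_mix)         # enhanced mixture, same shape

    class Separator(nn.Module):
        """Hypothetical separator mapping the enhanced mixture to two speaker estimates."""
        def __init__(self, hidden=256, num_speakers=2):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv1d(1, hidden, kernel_size=16, stride=8, padding=4),
                nn.ReLU(),
                nn.ConvTranspose1d(hidden, num_speakers, kernel_size=16, stride=8, padding=4),
            )

        def forward(self, enhanced_mix):
            return self.net(enhanced_mix)      # (batch, num_speakers, samples)

    if __name__ == "__main__":
        noisy = torch.randn(4, 1, 16000)       # four one-second noisy two-speaker mixtures
        enhanced = DenoisingFrontEnd()(noisy)  # stage 1: speech enhancement
        speakers = Separator()(enhanced)       # stage 2: talker-independent speaker separation
        print(speakers.shape)                  # torch.Size([4, 2, 16000])

In the deep CASA framework itself (see reference [2] below), the separation stage is organized as simultaneous grouping trained with frame-level permutation-invariant training followed by sequential grouping across frames; the sketch above only illustrates the front-end enhancement followed by separation described in this abstract.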
Pages: 6354 - 6358
Number of pages: 5
Related Papers
(50 records in total)
  • [1] Causal Deep CASA for Monaural Talker-Independent Speaker Separation
    Liu, Yuzhou
    Wang, DeLiang
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 2109 - 2118
  • [2] Divide and Conquer: A Deep CASA Approach to Talker-Independent Monaural Speaker Separation
    Liu, Yuzhou
    Wang, DeLiang
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (12) : 2092 - 2102
  • [3] TALKER-INDEPENDENT SPEECH RECOGNITION IN COMMERCIAL ENVIRONMENTS
    MOSHIER, S
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1979, 65 : S132 - S132
  • [4] TALKER-INDEPENDENT SPEAKER SEPARATION IN REVERBERANT CONDITIONS
    Delfarah, Masood
    Liu, Yuzhou
    Wang, DeLiang
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8723 - 8727
  • [5] A two-stage deep learning algorithm for talker-independent speaker separation in reverberant conditions
    Delfarah, Masood
    Liu, Yuzhou
    Wang, DeLiang
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2020, 148 (03): : 1157 - 1168
  • [6] Monaural speech separation based on MAXVQ and CASA for robust speech recognition
    Li, Peng
    Guan, Yong
    Wang, Shijin
    Xu, Bo
    Liu, Wenju
    [J]. COMPUTER SPEECH AND LANGUAGE, 2010, 24 (01): : 30 - 44
  • [7] EVIDENCE OF TALKER-INDEPENDENT INFORMATION FOR VOWELS
    VERBRUGGE, RR
    RAKERD, B
    [J]. LANGUAGE AND SPEECH, 1986, 29 : 39 - 57
  • [8] TALKER RECOGNITION IN TANDEM WITH TALKER-INDEPENDENT ISOLATED WORD RECOGNITION
    ROSENBERG, AE
    SHIRLEY, KL
    [J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1985, 33 (03): : 574 - 586
  • [9] Robust talker-independent audio document retrieval
    Jones, GJF
    Foote, JT
    Jones, KS
    Young, SJ
    [J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 311 - 314
  • [10] DEEP LEARNING FOR MONAURAL SPEECH SEPARATION
    Huang, Po-Sen
    Kim, Minje
    Hasegawa-Johnson, Mark
    Smaragdis, Paris
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,