An iterative model-based approach to cochannel speech separation

Cited by: 18
Authors
Hu, Ke [1]
Wang, DeLiang [1, 2]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH 43210 USA
Keywords
AUDITORY SCENE ANALYSIS; SEQUENTIAL ORGANIZATION; CHANNEL; RECOGNITION; SEGREGATION; SYSTEM;
DOI
10.1186/1687-4722-2013-14
Chinese Library Classification
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
Cochannel speech separation aims to separate two speech signals from a single mixture. In a supervised scenario, the identities of two speakers are given, and current methods use pre-trained speaker models for separation. One issue in model-based methods is the mismatch between training and test signal levels. We propose an iterative algorithm to adapt speaker models to match the signal levels in testing. Our algorithm first obtains initial estimates of source signals using unadapted speaker models and then detects the input signal-to-noise ratio (SNR) of the mixture. The input SNR is then used to adapt the speaker models for more accurate estimation. The two steps iterate until convergence. Compared to search-based SNR detection methods, our method is not limited to given SNR levels. Evaluations demonstrate that the iterative procedure converges quickly in a considerable range of SNRs and improves separation results significantly. Comparisons show that the proposed system performs significantly better than related model-based systems.
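The alternating loop described in the abstract (estimate the sources, detect the input SNR, adapt the model levels, repeat until convergence) can be illustrated with a small toy sketch. This is not the authors' system. It assumes idealized conditions that the abstract does not state: each "speaker model" is just a fixed, normalized power spectrum, the mixture power is exactly additive, and separation uses a simple ratio mask. All function and variable names are hypothetical.

```python
import numpy as np


def separate(mix_power, model_a, model_b, snr_db):
    """Split mixture power with a ratio mask built from level-adapted models."""
    # Adapt speaker A's model level by the current SNR estimate (A relative to B).
    adapted_a = model_a * 10.0 ** (snr_db / 10.0)
    mask = adapted_a / (adapted_a + model_b + 1e-12)
    return mask * mix_power, (1.0 - mask) * mix_power


def iterative_snr(mix_power, model_a, model_b, tol_db=1e-3, max_iter=200):
    """Alternate source estimation and SNR detection until the estimate settles."""
    snr_db = 0.0  # start from unadapted (equal-level) models
    for n_iter in range(1, max_iter + 1):
        est_a, est_b = separate(mix_power, model_a, model_b, snr_db)
        # Detect the input SNR from the energies of the current source estimates.
        new_snr_db = 10.0 * np.log10(est_a.sum() / est_b.sum())
        converged = abs(new_snr_db - snr_db) < tol_db
        snr_db = new_snr_db
        if converged:
            break
    return snr_db, n_iter


def make_toy_mixture(true_snr_db, n_bins=2000, seed=0):
    """Build a synthetic mixture (assumption: additive power, oracle models)."""
    rng = np.random.default_rng(seed)
    model_a = rng.exponential(size=n_bins)
    model_b = rng.exponential(size=n_bins)
    model_a /= model_a.sum()  # equal total power, i.e. 0 dB training level
    model_b /= model_b.sum()
    mix_power = model_a * 10.0 ** (true_snr_db / 10.0) + model_b
    return mix_power, model_a, model_b


mix_power, model_a, model_b = make_toy_mixture(true_snr_db=6.0)
snr_est, n_iter = iterative_snr(mix_power, model_a, model_b)
print(f"estimated input SNR: {snr_est:.2f} dB after {n_iter} iterations")
```

Because the SNR estimate is updated as a continuous value rather than picked from a preset grid, the toy loop is not restricted to a fixed set of SNR levels, which mirrors the contrast the abstract draws with search-based detection.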
Pages: 11