An iterative model-based approach to cochannel speech separation

Cited by: 18

Authors:

Hu, Ke [1]

Wang, DeLiang [1,2]

Affiliations:

[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA

[2] Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH 43210 USA

Source:

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | 2013

Keywords:

AUDITORY SCENE ANALYSIS; SEQUENTIAL ORGANIZATION; CHANNEL; RECOGNITION; SEGREGATION; SYSTEM;

DOI:

10.1186/1687-4722-2013-14

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Cochannel speech separation aims to separate two speech signals from a single mixture. In a supervised scenario, the identities of two speakers are given, and current methods use pre-trained speaker models for separation. One issue in model-based methods is the mismatch between training and test signal levels. We propose an iterative algorithm to adapt speaker models to match the signal levels in testing. Our algorithm first obtains initial estimates of source signals using unadapted speaker models and then detects the input signal-to-noise ratio (SNR) of the mixture. The input SNR is then used to adapt the speaker models for more accurate estimation. The two steps iterate until convergence. Compared to search-based SNR detection methods, our method is not limited to given SNR levels. Evaluations demonstrate that the iterative procedure converges quickly in a considerable range of SNRs and improves separation results significantly. Comparisons show that the proposed system performs significantly better than related model-based systems.
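The alternation described in the abstract (estimate the sources, detect the input SNR, adapt the models, repeat) can be illustrated with a toy sketch. This is not the authors' algorithm: the record does not describe their speaker models or estimator. Here each speaker model is replaced by a single fixed power-spectrum template, the mixture power is assumed to be the sum of the two source powers, and all names (`iterate_snr`, `template_1`, `template_2`) are hypothetical.

```python
import numpy as np


def snr_db(energy_target, energy_interferer):
    """Input SNR in dB: energy ratio of the two sources."""
    return 10.0 * np.log10(energy_target / energy_interferer)


def iterate_snr(mix_power, template_1, template_2, tol_db=1e-3, max_iter=100):
    """Toy alternation between source estimation and level adaptation.

    Assumption: each "speaker model" is one fixed power-spectrum template,
    and adapting a model means rescaling it by a single gain.
    """
    gain_1, gain_2 = 1.0, 1.0  # unadapted models (training level)
    previous = None
    for iteration in range(1, max_iter + 1):
        # Step 1: estimate each source with a ratio mask from the current models.
        model_1 = gain_1 * template_1
        model_2 = gain_2 * template_2
        estimate_1 = mix_power * model_1 / (model_1 + model_2)
        estimate_2 = mix_power - estimate_1

        # Step 2: detect the input SNR from the estimated sources.
        snr = float(snr_db(estimate_1.sum(), estimate_2.sum()))

        # Step 3: rescale each model so its energy matches its estimate.
        gain_1 = estimate_1.sum() / template_1.sum()
        gain_2 = estimate_2.sum() / template_2.sum()

        # Stop once the detected SNR no longer changes.
        if previous is not None and abs(snr - previous) < tol_db:
            break
        previous = snr
    return snr, iteration


# Synthetic mixture: the second source is 6 dB stronger than the first.
template_1 = np.array([8.0, 4.0, 2.0, 1.0, 0.5, 0.25])
template_2 = template_1[::-1].copy()
mix_power = template_1 + (10.0 ** 0.6) * template_2

snr, n_iter = iterate_snr(mix_power, template_1, template_2)
print(f"detected input SNR: {snr:.2f} dB after {n_iter} iterations")
```

Because the detected SNR is computed from the estimates rather than chosen from a grid, the sketch is not tied to a preset list of SNR levels, which is the contrast with search-based detection that the abstract draws.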

Pages: 11
