A generalized approach for model-based speaker-dependent single channel speech separation

Citations: 0
Authors
Radfar, M. H. [1 ]
Sayadiyan, A.
Dansereau, R. M.
Affiliations
[1] Amirkabir Univ Technol, Dept Elect Engn, Tehran 158754413, Iran
[2] Carleton Univ, Dept Syst & Comp Engn, Ottawa, ON K1S 5B6, Canada
Keywords
source separation; single channel speech separation; speaker identification; model-based single channel speech separation; Wiener filtering;
DOI
Not available
Chinese Library Classification
T [Industrial Technology];
Discipline Classification Code
08;
Abstract
In this paper, we present a new technique for separating two speech signals received from one microphone or one communication channel. In this special case, the separation problem is too ill-conditioned to be handled with common blind source separation techniques. The proposed technique is a generalized approach to model-based speaker-dependent single channel speech separation techniques in which a priori knowledge of the underlying speakers is used to separate speech signals. The proposed technique not only preserves the advantages of model-based speaker-dependent single channel speech separation algorithms (i.e. high separability), but is also able to separate the speech signals of an unlimited number of speakers given the speakers' models (i.e. generality). The whole algorithm consists of three stages: classification, identification, and separation. The identities of the speakers' speech signals in the mixed signal are first determined at the classification and identification stages. The identified speakers' models are then used to separate the underlying signals using a novel approach consisting of Gaussian mixture modeling, maximum likelihood estimation, and Wiener filtering. Evaluation results conducted on a database consisting of 100 mixed speech signals with target-to-interference ratios (TIR) ranging from -9 dB to +9 dB show significant performance improvements over those techniques which use a single model for separation.
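The separation stage described in the abstract combines Gaussian mixture models of each speaker's spectra, a maximum likelihood search over model components, and a Wiener filter built from the selected spectra. The sketch below illustrates that idea for a single spectral frame; it is an assumption-laden toy (the models, dimensions, and the log-spectral fit criterion are illustrative stand-ins, not the paper's actual configuration).

```python
import numpy as np

n_bins = 8  # frequency bins per frame (illustrative)

# Toy "trained" speaker models: each row is the mean log-power spectrum
# of one mixture component. Real models would be full GMMs per speaker.
model_a = np.stack([np.full(n_bins, 2.0), np.linspace(0.0, 3.0, n_bins)])
model_b = np.stack([np.full(n_bins, 1.0), np.linspace(3.0, 0.0, n_bins)])

def separate_frame(mix_power, model_a, model_b):
    """Pick the best-fitting pair of model components, then Wiener-filter.

    For each pair of components (one per speaker) the mixture's power
    spectrum is approximated as the sum of the two component spectra;
    the pair with the smallest log-spectral error plays the role of the
    ML estimate, and its spectra define the Wiener gains.
    """
    best, best_err = None, np.inf
    for sa in np.exp(model_a):          # candidate power spectra, speaker A
        for sb in np.exp(model_b):      # candidate power spectra, speaker B
            err = np.sum((np.log(sa + sb) - np.log(mix_power)) ** 2)
            if err < best_err:
                best_err, best = err, (sa, sb)
    sa, sb = best
    gain_a = sa / (sa + sb)             # Wiener gain for speaker A
    return gain_a * mix_power, (1.0 - gain_a) * mix_power

# Mix two frames drawn exactly from the models and separate them.
true_a = np.exp(model_a[1])
true_b = np.exp(model_b[0])
est_a, est_b = separate_frame(true_a + true_b, model_a, model_b)
print(np.allclose(est_a, true_a))  # → True (exact match in this toy case)
```

Because the Wiener gains for the two speakers sum to one at every frequency bin, the two estimates always reconstruct the mixture exactly; separation quality then depends entirely on how well the selected model spectra match the true sources.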
Pages: 361-375
Page count: 15
Related Papers
50 records in total
  • [1] Speaker-independent model-based single channel speech separation
    Radfar, M. H.
    Dansereau, R. M.
    Sayadiyan, A.
    [J]. NEUROCOMPUTING, 2008, 72 (1-3) : 71 - 78
  • [2] A TWO-STAGE SINGLE-CHANNEL SPEAKER-DEPENDENT SPEECH SEPARATION APPROACH FOR CHIME-5 CHALLENGE
    Sun, Lei
    Du, Jun
    Gao, Tian
    Fang, Yi
    Ma, Feng
    Pan, Jia
    Lee, Chin-Hui
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6650 - 6654
  • [3] A Speaker-Dependent Approach to Single-Channel Joint Speech Separation and Acoustic Modeling Based on Deep Neural Networks for Robust Recognition of Multi-Talker Speech
    Yan-Hui Tu
    Jun Du
    Chin-Hui Lee
    [J]. Journal of Signal Processing Systems, 2018, 90 : 963 - 973
  • [4] A Speaker-Dependent Approach to Single-Channel Joint Speech Separation and Acoustic Modeling Based on Deep Neural Networks for Robust Recognition of Multi-Talker Speech
    Tu, Yan-Hui
    Du, Jun
    Lee, Chin-Hui
    [J]. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2018, 90 (07): : 963 - 973
  • [5] A unified DNN approach to speaker-dependent simultaneous speech enhancement and speech separation in low SNR environments
    Gao, Tian
    Du, Jun
    Dai, Li-Rong
    Lee, Chin-Hui
    [J]. SPEECH COMMUNICATION, 2017, 95 : 28 - 39
  • [6] GAIN ESTIMATION IN MODEL-BASED SINGLE CHANNEL SPEECH SEPARATION
    Radfar, M. H.
    Wong, W.
    Chan, W-Y.
    Dansereau, R. M.
    [J]. 2009 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2009, : 423 - +
  • [7] Speaker Verification Based on Single Channel Speech Separation
    Jin, Rong
    Ablimit, Mijit
    Hamdulla, Askar
    [J]. IEEE ACCESS, 2023, 11 : 112631 - 112638
  • [8] A UNIFIED SPEAKER-DEPENDENT SPEECH SEPARATION AND ENHANCEMENT SYSTEM BASED ON DEEP NEURAL NETWORKS
    Gao, Tian
    Du, Jun
    Xu, Li
    Liu, Cong
    Dai, Li-Rong
    Lee, Chin-Hui
    [J]. 2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING, 2015, : 687 - 691
  • [9] Speaker-dependent model interpolation for statistical emotional speech synthesis
    Hsu, Chih-Yu
    Chen, Chia-Ping
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2012, : 1 - 10
  • [10] Speaker-dependent model interpolation for statistical emotional speech synthesis
    Chih-Yu Hsu
    Chia-Ping Chen
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2012