Cochannel Speech Separation Using Multi-pitch Estimation and Model Based Voiced Sequential Grouping

Cited by: 0

Authors:

Li, Ming [1]

Cao, Chuan [1]

Wang, Di [1]

Lu, Ping [1]

Fu, Qiang [1]

Yan, Yonghong [1]

Affiliation:

[1] Chinese Academy of Sciences, ThinkIT Speech Lab, Institute of Acoustics, Beijing 100190, People's Republic of China

Source:

INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008

Keywords:

Auditory scene analysis; cochannel speech; multi-pitch estimation; sequential grouping

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper proposes a new cochannel speech separation algorithm that uses multi-pitch extraction and speaker-model-based sequential grouping. After auditory segmentation based on onset and offset analysis, a robust multi-pitch estimation algorithm is applied to each segment, and the corresponding voiced portions are segregated. A speaker-pair model based on a support vector machine (SVM) is then employed to determine the optimal sequential grouping alignment and to group the speaker-homogeneous segments into pure speaker streams. Systematic evaluation on the speech separation challenge database shows significant improvement over the baseline performance.
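
The sequential grouping step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no details of the SVM speaker-pair model or of how alignments are scored, so the sketch below assumes a hypothetical `score` callable (a stand-in for a trained model's decision value, positive favouring speaker 0) and assumes that segments overlapping in time must belong to different speakers. It searches all two-speaker labelings exhaustively, which is only practical for a handful of segments.

```python
from itertools import product


def overlaps(a, b):
    """True if segments a and b, given as (start, end, feature), overlap in time."""
    return a[0] < b[1] and b[0] < a[1]


def group_segments(segments, score):
    """Assign each voiced segment to speaker 0 or 1 by exhaustive search.

    segments: list of (start, end, feature) tuples (hypothetical layout).
    score:    hypothetical stand-in for a speaker-pair model; maps a feature
              to a float, positive favouring speaker 0, negative speaker 1.
    Returns (labels, total) for the best labeling that satisfies the
    assumed constraint, or (None, -inf) if no labeling satisfies it.
    """
    n = len(segments)
    best_labels, best_total = None, float("-inf")
    for labels in product((0, 1), repeat=n):
        # Assumed constraint: time-overlapping segments get different speakers.
        if any(
            labels[i] == labels[j] and overlaps(segments[i], segments[j])
            for i in range(n)
            for j in range(i + 1, n)
        ):
            continue
        # Sum the evidence for this labeling: +score for speaker 0, -score for speaker 1.
        total = sum(
            score(feat) if lab == 0 else -score(feat)
            for (_, _, feat), lab in zip(segments, labels)
        )
        if total > best_total:
            best_labels, best_total = labels, total
    return best_labels, best_total


# Toy data: the "feature" is already a decision value, so score is the identity.
segments = [(0.0, 0.5, 2.0), (0.3, 0.8, 0.5), (0.9, 1.4, -1.0)]
labels, total = group_segments(segments, lambda feat: feat)
print(labels, total)
```

In this toy input the first two segments overlap in time, so they are forced onto different speakers even though both have positive scores; the third segment is free and follows its own score.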


Pages: 151-154

Page count: 4
