Monaural voiced speech segregation based on elaborate harmonic grouping strategies

Authors:

LIU WenJu 1

Affiliations:

2 Digital Media Content Technology Research Center

Source:

Science China (Information Sciences) | 2011 / Vol. 54 / No. 12

Funding:

National Natural Science Foundation of China;

Keywords:

computational auditory scene analysis; voiced speech separation; harmonic principle; minimum amplitude principle; elaborate harmonic grouping strategies;

DOI:

Not available

Chinese Library Classification (CLC) number:

TN912.3 [Speech signal processing];

Discipline classification code:

0711;

Abstract:

This paper proposes an enhanced algorithm for monaural voiced speech segregation based on several elaborate harmonic grouping strategies. Its main contributions lie in three aspects. First, the algorithm classifies time-frequency (T-F) units as resolved or unresolved by the carrier-to-envelope energy ratio, which yields more accurate classification than cross-channel correlation. Second, resolved T-F units are grouped according to the minimum amplitude principle, which has been verified to exist in human perception, as well as the harmonic principle. Finally, an "enhanced" envelope autocorrelation function is employed to detect amplitude modulation rates, which considerably reduces half-frequency errors in the grouping of unresolved units. Systematic evaluation and comparison show that the proposed algorithm greatly improves separation performance. Specifically, the signal-to-noise ratio (SNR) is improved by 0.96 dB compared with that of the previous method. The algorithm also improves the PESQ score and the subjective perception score.
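To make the idea of detecting an amplitude modulation rate from an envelope autocorrelation more concrete, here is a minimal sketch in Python. It is not the paper's "enhanced" envelope autocorrelation function, whose details the abstract does not give; it uses a plain Hilbert envelope and an ordinary autocorrelation. The function names (envelope, am_rate), the 80-400 Hz search range, and the synthetic test signal are all assumptions made for illustration only.

```python
import numpy as np

def envelope(x):
    """Hilbert envelope, computed from an FFT-based analytic signal."""
    n = len(x)
    spec = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    # Zero the negative frequencies, double the positive ones.
    return np.abs(np.fft.ifft(spec * h))

def am_rate(x, fs, fmin=80.0, fmax=400.0):
    """Estimate the AM rate (Hz) as the strongest envelope-ACF peak.

    fmin/fmax are an assumed plausible pitch range, not values from the paper.
    """
    env = envelope(x)
    env = env - env.mean()  # remove DC so the ACF reflects modulation only
    acf = np.correlate(env, env, mode="full")[len(env) - 1:]
    lag_lo = int(fs / fmax)
    lag_hi = int(fs / fmin)
    lag = lag_lo + int(np.argmax(acf[lag_lo:lag_hi + 1]))
    return fs / lag

# Synthetic stand-in for one unresolved filter channel (assumed values):
# a 2000 Hz carrier whose amplitude is modulated at 100 Hz.
fs = 16000
t = np.arange(int(0.05 * fs)) / fs  # one 50 ms frame
x = (1.0 + 0.8 * np.cos(2 * np.pi * 100.0 * t)) * np.cos(2 * np.pi * 2000.0 * t)
rate = am_rate(x, fs)
print(f"estimated AM rate: {rate:.1f} Hz")
```

Restricting the lag search to one plausible pitch range is a simple way to keep the peak at twice the true period out of consideration; the paper's own approach to half-frequency errors may differ.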

Pages: 2491-2500

Number of pages: 10

