Monaural voiced speech segregation based on elaborate harmonic grouping strategies

Authors
LIU WenJu 1
Affiliations
2 Digital Media Content Technology Research Center
Funding
National Natural Science Foundation of China;
Keywords
computational auditory scene analysis; voiced speech separation; harmonistic principle; minimum amplitude principle; elaborate harmonic grouping strategies;
DOI
Not available
Chinese Library Classification (CLC) number
TN912.3 [Speech signal processing];
Discipline classification code
0711;
Abstract
This paper proposes an enhanced algorithm for monaural voiced speech segregation based on several elaborate harmonic grouping strategies. Its main contributions are threefold. First, the algorithm classifies time-frequency (T-F) units as resolved or unresolved by their carrier-to-envelope energy ratio, which yields more accurate classification than cross-channel correlation. Second, resolved T-F units are grouped according to the minimum amplitude principle, which has been verified to exist in human perception, in addition to the harmonic principle. Finally, an "enhanced" envelope autocorrelation function is used to detect amplitude modulation rates, which substantially reduces half-frequency errors in the grouping of unresolved units. Systematic evaluation and comparison show that the proposed algorithm greatly improves separation performance. Specifically, the signal-to-noise ratio (SNR) is improved by 0.96 dB over the previous method. The algorithm also improves the PESQ score and the subjective perception score.
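To illustrate the general idea of detecting an amplitude modulation rate from an envelope autocorrelation function, the following is a minimal sketch using a plain (not "enhanced") envelope autocorrelation. It is not the paper's algorithm: the function names, the rectify-and-smooth envelope extractor, the 70 to 400 Hz search range, and the synthetic test signal are all assumptions chosen for illustration.

```python
import math

def envelope(signal, window):
    """Full-wave rectify, then smooth with a causal moving average (assumed extractor)."""
    rectified = [abs(s) for s in signal]
    smoothed = []
    running = 0.0
    for i, value in enumerate(rectified):
        running += value
        if i >= window:
            running -= rectified[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed

def am_rate_from_envelope_acf(signal, fs, min_hz=70.0, max_hz=400.0, window=8):
    """Estimate the amplitude modulation rate (Hz) as fs divided by the lag of the
    largest envelope autocorrelation value within an assumed plausible pitch range."""
    env = envelope(signal, window)
    mean = sum(env) / len(env)
    env = [e - mean for e in env]          # remove DC so the peak reflects periodicity
    min_lag = int(fs / max_hz)
    max_lag = int(fs / min_hz)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        n_terms = len(env) - lag
        # Normalize by the number of terms so longer lags are not penalized.
        acf = sum(env[n] * env[n + lag] for n in range(n_terms)) / n_terms
        if acf > best_value:
            best_lag, best_value = lag, acf
    return fs / best_lag

# Synthetic example: a 2000 Hz carrier amplitude-modulated at 100 Hz (hypothetical values).
fs = 16000
signal = [
    (1.0 + math.cos(2 * math.pi * 100 * n / fs)) * math.sin(2 * math.pi * 2000 * n / fs)
    for n in range(1600)
]
rate = am_rate_from_envelope_acf(signal, fs)
```

In this sketch the search range excludes lags shorter than one period of 400 Hz, so the estimate lands near the true modulation period rather than at lag zero. Restricting or reweighting candidate lags is one common way to limit octave-type errors, though the abstract does not specify how the paper's "enhanced" function achieves this.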
Pages: 2491-2500
Number of pages: 10