Monaural Voiced Speech Segregation Based on Dynamic Harmonic Function

Authors
Xueliang Zhang
Wenju Liu
Bo Xu
Affiliations
[1] Chinese Academy of Sciences, National Laboratory of Pattern Recognition (NLPR), Institute of Automation
[2] Inner Mongolia University, Computer Science Department
Keywords
Harmonic Order; Clean Speech; Complex Tone; Pitch Contour; Pitch Period
DOI
Not available
Abstract
The correlogram is an important representation for periodic signals and is widely used in pitch estimation and source separation. For these applications, its major problems are low resolution and redundant information. This paper proposes a voiced speech segregation system based on a newly introduced concept, the dynamic harmonic function (DHF). In the proposed system, conventional correlograms are further processed by replacing the autocorrelation function (ACF) with the DHF. The DHF has two advantages: 1) the peak width is adjustable by controlling the variance of the Gaussian function, and 2) the invalid peaks of the ACF, those not at the pitch period, tend to be suppressed. Based on the DHF, pitch detection and effective source segregation algorithms are proposed. The system is systematically evaluated and compared with the correlogram-based system. Both the signal-to-noise ratio results and the perceptual evaluation of speech quality scores show that the proposed system yields substantially better performance.
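The "invalid peaks" problem mentioned in the abstract can be illustrated with a small sketch. It does not implement the DHF, whose definition is not given in this record; it only computes a plain normalized ACF. The sampling rate, the 200 Hz pitch, the choice of the 4th harmonic as the dominant component of one channel, and the function names are all assumptions made for illustration.

```python
import numpy as np

def normalized_acf(frame, max_lag):
    """Normalized autocorrelation of one frame for lags 0..max_lag."""
    frame = frame - frame.mean()
    energy = float(np.dot(frame, frame))
    if energy == 0.0:
        return np.zeros(max_lag + 1)
    acf = np.array([np.dot(frame[:len(frame) - lag], frame[lag:])
                    for lag in range(max_lag + 1)])
    return acf / energy

def acf_peak_lags(acf):
    """Interior local maxima of the ACF, excluding lag 0 and the last lag."""
    return [lag for lag in range(1, len(acf) - 1)
            if acf[lag] > acf[lag - 1] and acf[lag] > acf[lag + 1]]

# Assumed setup: 16 kHz sampling, 200 Hz pitch, so the pitch period is 80 samples.
FS = 16000
F0 = 200.0
T = np.arange(1600) / FS  # 0.1 s frame

# Hypothetical output of one channel dominated by the 4th harmonic (800 Hz).
channel = np.sin(2 * np.pi * 4 * F0 * T)

# Complex tone made of harmonics 1 to 5, for comparison.
complex_tone = sum(np.sin(2 * np.pi * k * F0 * T) for k in range(1, 6))

if __name__ == "__main__":
    # The single-harmonic channel peaks at every multiple of its own period,
    # so most of its peaks do not sit at the pitch period.
    print(acf_peak_lags(normalized_acf(channel, 100)))
    # For the full complex tone, the strongest peak away from lag 0
    # falls at the pitch period.
    tone_acf = normalized_acf(complex_tone, 100)
    print(20 + int(np.argmax(tone_acf[20:])))
```

In this sketch, a channel that resolves a single harmonic produces ACF peaks at every multiple of that harmonic's period, and only one of them coincides with the pitch period. This is the kind of redundancy the abstract says the DHF is meant to suppress.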
Related papers
50 in total
  • [41] Segregation of voiced and unvoiced components from residual of speech signal
    Jo, Cheol-woo
    Kim, Jae-hee
    [J]. JOURNAL OF CENTRAL SOUTH UNIVERSITY, 2012, 19 (02) : 496 - 503
  • [42] Monaural Speech Segregation Based on Pitch Track Correction Using An Ensemble Kalman Filter
    Kim, Han-Gyu
    Jang, Gil-Jin
    Park, Jeong-Sik
    Oh, Yung-Hwan
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 813 - 816
  • [43] Monaural speech segregation based on fusion of source-driven with model-driven techniques
    Radfar, Mohammad H.
    Dansereau, Richard M.
    Sayadiyan, Abolghasem
    [J]. SPEECH COMMUNICATION, 2007, 49 (06) : 464 - 476
  • [44] Binary mask estimation for voiced speech segregation using Bayesian method
    Liang, Shan
    Liu, Wenju
    [J]. 2011 FIRST ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR), 2011, : 345 - 349
  • [45] Monaural Auditory-Based Unvoiced Speech Segregation Using SNR-Based Subband Spectral Subtraction
    Geravanchizadeh, Masoud
    Dadvar, Paria
    [J]. ACTA ACUSTICA UNITED WITH ACUSTICA, 2014, 100 (02) : 353 - 361
  • [46] Speech and music pitch trajectory classification using recurrent neural networks for monaural speech segregation
    Han-Gyu Kim
    Gil-Jin Jang
    Yung-Hwan Oh
    Ho-Jin Choi
    [J]. The Journal of Supercomputing, 2020, 76 : 8193 - 8213
  • [47] Speech and music pitch trajectory classification using recurrent neural networks for monaural speech segregation
    Kim, Han-Gyu
    Jang, Gil-Jin
    Oh, Yung-Hwan
    Choi, Ho-Jin
    [J]. JOURNAL OF SUPERCOMPUTING, 2020, 76 (10) : 8193 - 8213
  • [48] A Recursive Network with Dynamic Attention for Monaural Speech Enhancement
    Li, Andong
    Zheng, Chengshi
    Fan, Cunhang
    Peng, Renhua
    Li, Xiaodong
    [J]. INTERSPEECH 2020, 2020, : 2422 - 2426
  • [49] Speech enhancement based on a voiced-unvoiced speech model
    Goh, Z
    Tan, KC
    Tan, BTG
    [J]. PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 401 - 404
  • [50] Investigation of Cost Function for Supervised Monaural Speech Separation
    Liu, Yun
    Zhang, Hui
    Zhang, Xueliang
    Cao, Yuhang
    [J]. INTERSPEECH 2019, 2019, : 3178 - 3182