HMM-Based Cue Parameters Estimation for Speech Enhancement

Cited by: 0
Authors:
Deng, Feng [1]
Bao, Chang-chun [1]
Jia, Mao-shen [1]
Affiliation:
[1] Beijing Univ Technol, Sch Elect Informat & Control Engn, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China
Funding:
National Natural Science Foundation of China
Keywords:
speech enhancement; HMM; cue parameters; priori information; NOISE;
DOI:
none
Chinese Library Classification:
TP301 [Theory, Methods]
Discipline Code:
081202
Abstract
In this paper, a hidden Markov model (HMM)-based cue parameter estimation method for single-channel speech enhancement is proposed, in which the cue parameters of binaural cue coding (BCC) are successfully applied to a single-channel speech enhancement system. First, the clean speech and noise signals are treated as the left and right channels of a stereo signal, respectively, and the noisy speech is treated as the down-mixed mono signal of the BCC method. From the clean speech and noise data set and the corresponding noisy speech data set, the clean cue parameters and pre-enhanced cue parameters are extracted, respectively. Then the cue HMM is trained offline, exploiting the a priori information about the clean and pre-enhanced cue parameters for speech enhancement. Next, using the trained cue HMM, the clean cue parameters are estimated from noisy speech online. Finally, following the synthesis principle of BCC cue parameters, a speech estimator is constructed for enhancing the noisy speech. The test results demonstrate that, in terms of segmental signal-to-noise ratio (SNR), log spectral distortion and PESQ measures, the proposed method performs better than the reference methods.
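The abstract's first step, treating clean speech and noise as the two channels of a stereo pair and extracting BCC cue parameters from them, can be sketched as below. This is a minimal illustration only, assuming the inter-channel level difference (ICLD) as the cue; the function names (`frame_signal`, `icld`) and all parameter values are hypothetical, not the authors' implementation.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    # Split a 1-D signal into overlapping frames (no padding of the tail).
    n = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def icld(left, right, frame_len=256, hop=128, eps=1e-12):
    # Inter-channel level difference in dB per frame and frequency bin,
    # in the spirit of BCC cue extraction. Here `left` plays the role of
    # clean speech and `right` the role of noise, as in the paper's setup.
    win = np.hanning(frame_len)
    L = np.fft.rfft(frame_signal(left, frame_len, hop) * win, axis=1)
    R = np.fft.rfft(frame_signal(right, frame_len, hop) * win, axis=1)
    return 10.0 * np.log10((np.abs(L) ** 2 + eps) / (np.abs(R) ** 2 + eps))

# Toy demo: a 440 Hz tone as "speech" vs. weak white noise, fs = 8 kHz.
rng = np.random.default_rng(0)
t = np.arange(4096) / 8000.0
speech = np.sin(2 * np.pi * 440.0 * t)
noise = 0.1 * rng.standard_normal(4096)
cues = icld(speech, noise)  # shape: (frames, frequency bins)
```

In this toy example the ICLD is strongly positive near the tone's frequency bin, which is exactly the kind of frame-and-band-wise cue an HMM could then model statistically.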
Pages: 4
Related Papers (showing 41-50 of 50)
  • [41] HMM-based speech enhancement using sub-word models and noise adaptation
    Kato, Akihiro
    Milner, Ben
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3748 - 3752
  • [42] Subjective analysis of an HMM-based visual speech synthesizer
    Williams, JJ
    Katsaggelos, AK
    Garstecki, DC
    HUMAN VISION AND ELECTRONIC IMAGING VI, 2001, 4299 : 544 - 555
  • [43] Use of voicing features in HMM-based speech recognition
    Thomson, DL
    Chengalvarayan, R
    SPEECH COMMUNICATION, 2002, 37 (3-4) : 197 - 211
  • [44] A study of HMM-based bandwidth extension of speech signals
    Song, Geun-Bae
    Martynovich, Pavel
    SIGNAL PROCESSING, 2009, 89 (10) : 2036 - 2044
  • [45] Multimodal HMM-based NAM-to-speech conversion
    Tran, Viet-Anh
    Bailly, Gerard
    Loevenbruck, Helene
    Toda, Tomoki
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 648 - +
  • [46] Modified Viterbi Scoring for HMM-Based Speech Recognition
    Jo, Jihyuck
    Kim, Han-Gyu
    Park, In-Cheol
    Jung, Bang Chul
    Yoo, Hoyoung
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2019, 25 (02): : 351 - 358
  • [47] State duration modeling for HMM-based speech synthesis
    Zen, Heiga
    Masuko, Takashi
    Tokuda, Keiichi
    Yoshimura, Takayoshi
    Kobayashi, Takao
    Kitamura, Tadashi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (03): : 692 - 693
  • [48] Analysis and HMM-based synthesis of hypo and hyperarticulated speech
    Picart, Benjamin
    Drugman, Thomas
    Dutoit, Thierry
    COMPUTER SPEECH AND LANGUAGE, 2014, 28 (02): : 687 - 707
  • [49] Optimal Number of States in HMM-Based Speech Synthesis
    Hanzlicek, Zdenek
    TEXT, SPEECH, AND DIALOGUE, TSD 2017, 2017, 10415 : 353 - 361
  • [50] Normalized training for HMM-based visual speech recognition
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    Kitamura, Tadashi
    Kobayashi, Takao
    ELECTRONICS AND COMMUNICATIONS IN JAPAN PART III-FUNDAMENTAL ELECTRONIC SCIENCE, 2006, 89 (11): : 40 - 50