HMM-Based Cue Parameters Estimation for Speech Enhancement

Cited by: 0
Authors:
Deng, Feng [1]
Bao, Chang-chun [1]
Jia, Mao-shen [1]
Affiliation:
[1] Beijing Univ Technol, Sch Elect Informat & Control Engn, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China
Funding:
National Natural Science Foundation of China
Keywords:
speech enhancement; HMM; cue parameters; priori information; NOISE;
DOI:
none
Chinese Library Classification:
TP301 [Theory, Methods]
Discipline Code:
081202
Abstract
In this paper, a hidden Markov model (HMM)-based cue parameter estimation method for single-channel speech enhancement is proposed, in which the cue parameters of binaural cue coding (BCC) are successfully applied to a single-channel speech enhancement system. First, the clean speech and noise signals are treated as the left and right channels of a stereo signal, respectively, and the noisy speech is treated as the down-mixed mono signal of the BCC method. From the clean speech and noise data set and the corresponding noisy speech data set, the clean cue parameters and pre-enhanced cue parameters are extracted, respectively. Then the cue HMM is trained offline, exploiting the a priori information about the clean and pre-enhanced cue parameters for speech enhancement. Next, using the trained cue HMM, the clean cue parameters are estimated from noisy speech online. Finally, following the synthesis principle of BCC cue parameters, a speech estimator is constructed for enhancing the noisy speech. The test results demonstrate that, in terms of segmental signal-to-noise ratio (SNR), log spectral distortion and PESQ measures, the proposed method performs better than the reference methods.
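The abstract's first step, treating clean speech and noise as the two channels of a stereo pair and extracting BCC cue parameters from them, can be sketched as below. This is a minimal illustration only, assuming the inter-channel level difference (ICLD) as the cue; the function names (`frame_signal`, `icld`) and all parameter values are hypothetical, not the authors' implementation.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    # Split a 1-D signal into overlapping frames (no padding of the tail).
    n = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def icld(left, right, frame_len=256, hop=128, eps=1e-12):
    # Inter-channel level difference in dB per frame and frequency bin,
    # in the spirit of BCC cue extraction. Here `left` plays the role of
    # clean speech and `right` the role of noise, as in the paper's setup.
    win = np.hanning(frame_len)
    L = np.fft.rfft(frame_signal(left, frame_len, hop) * win, axis=1)
    R = np.fft.rfft(frame_signal(right, frame_len, hop) * win, axis=1)
    return 10.0 * np.log10((np.abs(L) ** 2 + eps) / (np.abs(R) ** 2 + eps))

# Toy demo: a 440 Hz tone as "speech" vs. weak white noise, fs = 8 kHz.
rng = np.random.default_rng(0)
t = np.arange(4096) / 8000.0
speech = np.sin(2 * np.pi * 440.0 * t)
noise = 0.1 * rng.standard_normal(4096)
cues = icld(speech, noise)  # shape: (frames, frequency bins)
```

In this toy example the ICLD is strongly positive near the tone's frequency bin, which is exactly the kind of frame-and-band-wise cue an HMM could then model statistically.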
Pages: 4
Related Papers (showing 41-50 of 50)
  • [41] HMM-based speech enhancement using sub-word models and noise adaptation
    Kato, Akihiro
    Milner, Ben
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3748 - 3752
  • [42] Subjective analysis of an HMM-based visual speech synthesizer
    Williams, JJ
    Katsaggelos, AK
    Garstecki, DC
    HUMAN VISION AND ELECTRONIC IMAGING VI, 2001, 4299 : 544 - 555
  • [43] Use of voicing features in HMM-based speech recognition
    Thomson, DL
    Chengalvarayan, R
    SPEECH COMMUNICATION, 2002, 37 (3-4) : 197 - 211
  • [44] A study of HMM-based bandwidth extension of speech signals
    Song, Geun-Bae
    Martynovich, Pavel
    SIGNAL PROCESSING, 2009, 89 (10) : 2036 - 2044
  • [45] Multimodal HMM-based NAM-to-speech conversion
    Tran, Viet-Anh
    Bailly, Gerard
    Loevenbruck, Helene
    Toda, Tomoki
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 648 - +
  • [46] Modified Viterbi Scoring for HMM-Based Speech Recognition
    Jo, Jihyuck
    Kim, Han-Gyu
    Park, In-Cheol
    Jung, Bang Chul
    Yoo, Hoyoung
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2019, 25 (02): : 351 - 358
  • [47] State duration modeling for HMM-based speech synthesis
    Zen, Heiga
    Masuko, Takashi
    Tokuda, Keiichi
    Yoshimura, Takayoshi
    Kobayashi, Takao
    Kitamura, Tadashi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (03): : 692 - 693
  • [48] Analysis and HMM-based synthesis of hypo and hyperarticulated speech
    Picart, Benjamin
    Drugman, Thomas
    Dutoit, Thierry
    COMPUTER SPEECH AND LANGUAGE, 2014, 28 (02): : 687 - 707
  • [49] Optimal Number of States in HMM-Based Speech Synthesis
    Hanzlicek, Zdenek
    TEXT, SPEECH, AND DIALOGUE, TSD 2017, 2017, 10415 : 353 - 361
  • [50] Normalized training for HMM-based visual speech recognition
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    Kitamura, Tadashi
    Kobayashi, Takao
    ELECTRONICS AND COMMUNICATIONS IN JAPAN PART III-FUNDAMENTAL ELECTRONIC SCIENCE, 2006, 89 (11): : 40 - 50