A decision tree-based clustering approach to state definition in an excitation modeling framework for HMM-based speech synthesis

被引:0
|
作者
Maia, Ranniery
Toda, Tomoki
Tokuda, Keiichi
Sakai, Shinsuke
Nakamura, Satoshi
机构
关键词
speech synthesis; HMM-based speech synthesis; decision tree-based clustering; residual modeling;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper presents a decision tree-based algorithm to cluster residual segments assuming an excitation model based on state-dependent filtering of pulse train and white noise. The decision tree construction principle is the same as the one applied to speech recognition. Here parent nodes are split using the residual maximum likelihood criterion. Once these excitation decision trees are constructed for residual signals segmented by full context models, using questions related to the full context of the training sentences, they can be utilized for excitation modeling in speech synthesis based on hidden Markov models (HMM). Experimental results have shown that the algorithm in question is very effective in terms of clustering residual signals given segmentation. pitch marks and full context questions, resulting in filters with good residual modeling properties.
引用
收藏
页码:1743 / 1746
页数:4
相关论文
共 50 条
  • [11] A BAYESIAN APPROACH TO HMM-BASED SPEECH SYNTHESIS
    Hashimoto, Kei
    Zen, Heiga
    Nankaku, Yoshihiko
    Masuko, Takashi
    Tokuda, Keiichi
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4029 - +
  • [12] FULL COVARIANCE STATE DURATION MODELING FOR HMM-BASED SPEECH SYNTHESIS
    Lu, Heng
    Wu, Yi-Jian
    Tokuda, Keiichi
    Dai, Li-Rong
    Wang, Ren-Hua
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4033 - +
  • [13] CROSS-VALIDATION BASED DECISION TREE CLUSTERING FOR HMM-BASED TTS
    Zhang, Yu
    Yan, Zhi-Jie
    Soong, Frank K.
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4602 - 4605
  • [14] Amplitude Spectrum based Excitation Model for HMM-based Speech Synthesis
    Wen, Zhengqi
    Tao, Jianhua
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1426 - 1429
  • [15] Two-band excitation for HMM-based speech synthesis
    Kim, Sang-Jin
    Hahn, Minsoo
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (01) : 378 - 381
  • [16] Statistical model training technique based on speaker clustering approach for HMM-based speech synthesis
    Ijima, Yusuke
    Miyazaki, Noboru
    Mizuno, Hideyuki
    Sakauchi, Sumitaka
    SPEECH COMMUNICATION, 2015, 71 : 50 - 61
  • [17] Improved Training of Excitation for HMM-based Parametric Speech Synthesis
    Shiga, Yoshinori
    Toda, Tomoki
    Sakai, Shinsuke
    Kawai, Hisashi
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 809 - 812
  • [18] Analysis of speaker clustering strategies for HMM-based speech synthesis
    Dall, Rasmus
    Veaux, Christophe
    Yamagishi, Junichi
    King, Simon
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 994 - 997
  • [19] Transform Mapping Using Shared Decision Tree Context Clustering for HMM-Based Cross-Lingual Speech Synthesis
    Nagahama, Daiki
    Nose, Takashi
    Koriyama, Tomoki
    Kobayashi, Takao
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 770 - 774
  • [20] Soft context clustering for F0 modeling in HMM-based speech synthesis
    Khorram, Soheil
    Sameti, Hossein
    King, Simon
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2015,