A decision tree-based clustering approach to state definition in an excitation modeling framework for HMM-based speech synthesis

Cited by: 0
Authors
Maia, Ranniery
Toda, Tomoki
Tokuda, Keiichi
Sakai, Shinsuke
Nakamura, Satoshi
Affiliations
Keywords
speech synthesis; HMM-based speech synthesis; decision tree-based clustering; residual modeling;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a decision tree-based algorithm to cluster residual segments, assuming an excitation model based on state-dependent filtering of a pulse train and white noise. The decision tree construction principle is the same as the one applied to speech recognition. Here, parent nodes are split using the residual maximum likelihood criterion. Once these excitation decision trees are constructed for residual signals segmented by full context models, using questions related to the full context of the training sentences, they can be utilized for excitation modeling in speech synthesis based on hidden Markov models (HMMs). Experimental results have shown that the algorithm in question is very effective in terms of clustering residual signals given segmentation, pitch marks and full context questions, resulting in filters with good residual modeling properties.
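The splitting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper scores splits with a residual likelihood tied to the state-dependent filters, whereas this example substitutes a single scalar Gaussian per node as a stand-in. The names `gaussian_log_likelihood`, `best_split`, the toy segments and the two context questions are all hypothetical, introduced only for illustration.

```python
import math


def gaussian_log_likelihood(values):
    """Log-likelihood of the values under one Gaussian fitted by maximum likelihood.

    Assumption: a scalar Gaussian stands in for the paper's residual likelihood.
    """
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    var = max(var, 1e-8)  # variance floor to avoid log(0)
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def best_split(segments, questions):
    """Pick the context question that gives the largest likelihood gain.

    segments:  list of (context_dict, value) pairs assigned to the parent node
    questions: dict mapping a question name to a predicate on the context
    Returns (question_name, gain), or (None, 0.0) if no question improves the node.
    """
    parent_ll = gaussian_log_likelihood([v for _, v in segments])
    best_name, best_gain = None, 0.0
    for name, predicate in questions.items():
        yes = [v for c, v in segments if predicate(c)]
        no = [v for c, v in segments if not predicate(c)]
        if not yes or not no:
            continue  # this question does not partition the node
        gain = gaussian_log_likelihood(yes) + gaussian_log_likelihood(no) - parent_ll
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain


# Toy data (hypothetical): one scalar feature per residual segment.
segments = [
    ({"phone": "a", "stressed": True}, 0.1),
    ({"phone": "i", "stressed": False}, -0.1),
    ({"phone": "s", "stressed": True}, 5.1),
    ({"phone": "t", "stressed": False}, 4.9),
]
questions = {
    "is_vowel": lambda c: c["phone"] in "aeiou",
    "is_stressed": lambda c: c["stressed"],
}
name, gain = best_split(segments, questions)
print(name, round(gain, 2))
```

In a full tree builder, this selection would be applied recursively to each child node until the gain falls below a stopping threshold; the threshold and the exact likelihood function are design choices the abstract does not specify.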
Pages: 1743-1746
Page count: 4
Related papers
50 in total
  • [1] Decision Tree-based Clustering with Outlier Detection for HMM-based Speech Synthesis
    Oh, Kyung Hwan
    Sung, June Sig
    Hong, Doo Hwa
    Kim, Nam Soo
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 108 - +
  • [2] On the state definition for a trainable excitation model in HMM-based speech synthesis
    Maia, R.
    Toda, T.
    Tokuda, K.
    Sakai, S.
    Nakamura, S.
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 3965 - 3968
  • [3] Extended Decision Tree with OR Relationship for HMM-based Speech Synthesis
    Wang, Yang
    Tao, Jianhua
    Yang, Minghao
    Li, Ya
    2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013), 2013, : 225 - 229
  • [4] Excitation Modeling Based on Waveform Interpolation for HMM-based Speech Synthesis
    Sung, June Sig
    Hong, Doo Hwa
    Oh, Kyung Hwan
    Kim, Nam Soo
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 813 - 816
  • [5] State duration modeling for HMM-based speech synthesis
    Zen, Heiga
    Masuko, Takashi
    Tokuda, Keiichi
    Yoshimura, Takayoshi
    Kobayashi, Takao
    Kitamura, Tadashi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (03): : 692 - 693
  • [6] Statistical Approaches to Excitation Modeling in HMM-Based Speech Synthesis
    Sung, June Sig
    Hong, Doo Hwa
    Koo, Hyun Woo
    Kim, Nam Soo
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2013, E96D (02): : 379 - 382
  • [7] Speaking style adaptation using context clustering decision tree for HMM-based speech synthesis
    Yamagishi, J
    Tachibana, M
    Masuko, T
    Kobayashi, T
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 5 - 8
  • [8] Excitation Modeling for HMM-based Speech Synthesis Based on Principal Component Analysis
    Narendra, N. P.
    Reddy, M. Kiran
    Rao, K. Sreenivasa
    2016 TWENTY SECOND NATIONAL CONFERENCE ON COMMUNICATION (NCC), 2016,
  • [9] Excitation Modeling Method Based on Inverse Filtering for HMM-Based Speech Synthesis
    Reddy, M. Kiran
    Rao, K. Sreenivasa
    MACHINE INTELLIGENCE AND SIGNAL ANALYSIS, 2019, 748 : 85 - 91
  • [10] A trainable excitation model for HMM-based speech synthesis
    Maia, R.
    Toda, T.
    Zen, H.
    Nankaku, Y.
    Tokuda, K.
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1125 - +