CROSS VALIDATION AND MINIMUM GENERATION ERROR FOR IMPROVED MODEL CLUSTERING IN HMM-BASED TTS

Authors
Xie, Feng-Long [1]
Wu, Yi-Jian [1]
Soong, Frank K. [1]
Affiliation
[1] Microsoft Research Asia, Beijing, People's Republic of China
Keywords
cross validation; minimum generation error; context clustering; HMM-based synthesis
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In HMM-based speech synthesis, context-dependent hidden Markov models (HMMs) are widely used for their ability to synthesize highly intelligible and fairly smooth speech. However, training HMMs well for all possible contexts is difficult, or even impossible, because the training data inherently cannot cover every context. As a result, the trained models may overfit, and their ability to predict unseen contexts at test time is severely limited. Recently, cross-validation (CV) has been applied to decision-tree-based clustering with the maximum-likelihood (ML) criterion and has shown improved robustness in TTS synthesis. In this paper, we generalize CV to decision-tree clustering with a different criterion, minimum generation error (MGE). Experimental results show that the generalization to MGE yields better TTS synthesis performance than the baseline systems.
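The idea of choosing a decision-tree split by held-out error rather than by fit to the training data can be illustrated with a minimal sketch. It is not the authors' implementation. As a simplifying assumption, each sample carries a single scalar value, and "generation error" is approximated by the squared distance between a held-out value and the mean of the cluster it falls into; a real MGE criterion would compare generated parameter trajectories with natural ones. The names `cv_generation_error`, `best_split`, `samples`, and `questions` are hypothetical.

```python
from statistics import mean


def cv_generation_error(samples, question, k=5):
    """Sum of held-out squared errors over k folds.

    samples:  list of (context, value) pairs (needs at least 2 samples).
    question: function context -> bool, or None to score the unsplit node.
    """
    total = 0.0
    for fold in range(k):
        # Interleaved fold assignment: sample i belongs to fold i % k.
        train = [s for i, s in enumerate(samples) if i % k != fold]
        held = [s for i, s in enumerate(samples) if i % k == fold]
        parent_mean = mean(v for _, v in train)
        child_mean = {}
        if question is not None:
            for answer in (True, False):
                vals = [v for c, v in train if bool(question(c)) == answer]
                # Fall back to the parent mean if a child is empty in this fold.
                child_mean[answer] = mean(vals) if vals else parent_mean
        for c, v in held:
            if question is None:
                pred = parent_mean
            else:
                pred = child_mean[bool(question(c))]
            total += (v - pred) ** 2
    return total


def best_split(samples, questions, k=5):
    """Return (question name, gain) with the largest held-out error reduction.

    Returns (None, 0.0) when no question reduces the held-out error,
    which acts as the stopping rule for growing the tree.
    """
    base = cv_generation_error(samples, None, k)
    best_name, best_gain = None, 0.0
    for name, q in questions.items():
        gain = base - cv_generation_error(samples, q, k)
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain


# Toy data (made up for illustration): the value depends on "vowel" only.
samples = []
for i in range(20):
    ctx = {"vowel": i % 2 == 0, "stressed": (i // 2) % 2 == 0}
    samples.append((ctx, (5.0 if ctx["vowel"] else 1.0) + 0.1 * (i % 3)))

questions = {
    "is_vowel": lambda c: c["vowel"],
    "is_stressed": lambda c: c["stressed"],
}

choice = best_split(samples, questions)
```

In this sketch a split is accepted only if it lowers the error on data that was not used to estimate the cluster means, so a question that merely fits noise in the training portion gains nothing. That is the robustness argument the abstract makes for CV-based clustering.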
Pages: 60-63
Number of pages: 4