AN OPTIMIZATION ALGORITHM OF INDEPENDENT MEAN AND VARIANCE PARAMETER TYING STRUCTURES FOR HMM-BASED SPEECH SYNTHESIS

Citations: 0
Authors
Takaki, Shinji [1 ]
Oura, Keiichiro [1 ]
Nankaku, Yoshihiko [1 ]
Tokuda, Keiichi [1 ]
Affiliations
[1] Nagoya Inst Technol, Dept Comp Sci & Engn, Nagoya, Aichi 4668555, Japan
Keywords
speech synthesis; hidden Markov models; decision trees; context clustering;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
This paper proposes a technique for constructing independent parameter tying structures for the mean and variance parameters in HMM-based speech synthesis. Conventionally, mean and variance parameters are assumed to share the same tying structure. However, it has been reported that clustering the mean vectors while tying all variance matrices improves the quality of synthesized speech. This suggests that the optimal tying structures for mean and variance parameters differ. In the proposed technique, the decision trees for the mean and variance parameters are grown simultaneously, taking into account the dependency between the mean and variance parameters. Experimental results show that the proposed technique outperforms the conventional one.
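The dependency between mean and variance parameters mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes 1-D Gaussians, hypothetical function names (`fit_tied_gaussians`, `log_likelihood`), and tying structures given as plain index lists rather than decision trees. It only shows that when means and variances are tied differently, each maximum-likelihood estimate depends on the other, so the two are updated alternately.

```python
import math

def fit_tied_gaussians(x, mean_id, var_id, n_iter=20):
    """Alternating ML updates for 1-D Gaussians whose means and variances
    are tied by separate index lists (hypothetical helper, illustration only).
    Assumes cluster ids are 0..K-1 and every id occurs at least once."""
    n_mean, n_var = max(mean_id) + 1, max(var_id) + 1
    mu, var = [0.0] * n_mean, [1.0] * n_var
    for _ in range(n_iter):
        # Mean update: precision-weighted average over samples sharing a mean.
        num, den = [0.0] * n_mean, [0.0] * n_mean
        for xi, m, v in zip(x, mean_id, var_id):
            num[m] += xi / var[v]
            den[m] += 1.0 / var[v]
        mu = [n / d for n, d in zip(num, den)]
        # Variance update: average squared residual over samples sharing a variance.
        sq, cnt = [0.0] * n_var, [0] * n_var
        for xi, m, v in zip(x, mean_id, var_id):
            sq[v] += (xi - mu[m]) ** 2
            cnt[v] += 1
        var = [max(s / c, 1e-6) for s, c in zip(sq, cnt)]  # floor avoids zero variance
    return mu, var

def log_likelihood(x, mean_id, var_id, mu, var):
    """Total Gaussian log-likelihood under the given tying structures."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var[v]) + (xi - mu[m]) ** 2 / var[v])
        for xi, m, v in zip(x, mean_id, var_id)
    )

# Toy data: two contexts with different means and different spreads.
x = [0.0, 0.2, -0.2, 5.0, 5.5, 4.5]
mean_id = [0, 0, 0, 1, 1, 1]   # means split into two clusters
tied_var = [0, 0, 0, 0, 0, 0]  # one variance shared by all samples
split_var = [0, 0, 0, 1, 1, 1] # variance follows the mean structure

mu_t, var_t = fit_tied_gaussians(x, mean_id, tied_var)
mu_s, var_s = fit_tied_gaussians(x, mean_id, split_var)
ll_tied = log_likelihood(x, mean_id, tied_var, mu_t, var_t)
ll_split = log_likelihood(x, mean_id, split_var, mu_s, var_s)
```

In a decision-tree setting, a likelihood difference such as `ll_split - ll_tied` is the kind of quantity that would be weighed against a model-complexity penalty when deciding whether to split a node; the specific criterion used in the paper is not stated in the abstract.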
Pages: 4700 - 4703
Number of pages: 4
Related papers
50 in total
  • [31] Analysis of HMM-Based Lombard Speech Synthesis
    Raitio, Tuomo
    Suni, Antti
    Vainio, Martti
    Alku, Paavo
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2792 - +
  • [32] An improved training algorithm in HMM-based speech recognition
    Li, GJ
    Huong, TY
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1057 - 1060
  • [33] Minimum Kullback-Leibler Divergence Parameter Generation for HMM-Based Speech Synthesis
    Ling, Zhen-Hua
    Dai, Li-Rong
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (05): : 1492 - 1502
  • [34] An Evaluation of Parameter Generation Methods with Rich Context Models in HMM-Based Speech Synthesis
    Takamichi, Shinnosuke
    Toda, Tomoki
    Shiga, Yoshinori
    Kawai, Hisashi
    Sakti, Sakriani
    Nakamura, Satoshi
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1138 - 1141
  • [35] Minimum generation error criterion considering global/local variance for HMM-based speech synthesis
    Wu, Yi-Jian
    Zen, Heiga
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4621 - 4624
  • [36] State duration modeling for HMM-based speech synthesis
    Zen, Heiga
    Masuko, Takashi
    Tokuda, Keiichi
    Yoshimura, Takayoshi
    Kobayashi, Takao
    Kitamura, Tadashi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (03): : 692 - 693
  • [37] Analysis and HMM-based synthesis of hypo and hyperarticulated speech
    Picart, Benjamin
    Drugman, Thomas
    Dutoit, Thierry
    [J]. COMPUTER SPEECH AND LANGUAGE, 2014, 28 (02): : 687 - 707
  • [38] Optimal Number of States in HMM-Based Speech Synthesis
    Hanzlicek, Zdenek
    [J]. TEXT, SPEECH, AND DIALOGUE, TSD 2017, 2017, 10415 : 353 - 361
  • [39] Synthesis and evaluation of conversational characteristics in HMM-based speech synthesis
    Andersson, Sebastian
    Yamagishi, Junichi
    Clark, Robert A. J.
    [J]. SPEECH COMMUNICATION, 2012, 54 (02) : 175 - 188
  • [40] Speaker interpolation for HMM-based speech synthesis system
    [J]. Yoshimura, Takayoshi, 2000, Acoustical Soc Jpn, Tokyo, Japan (21):