Robustness of HMM-based Speech Synthesis

Authors
Yamagishi, Junichi [1]
Ling, Zhenhua [1]
King, Simon [1]
Affiliations
[1] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh, Midlothian, Scotland
Keywords
speech synthesis; HMM; unit selection; HTS
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As speech synthesis techniques become more advanced, we are able to consider building high-quality voices from data collected outside the usual highly controlled recording studio environment. This presents new challenges that are not present in conventional text-to-speech synthesis: the available speech data are not perfectly clean, the recording conditions are not consistent, and/or the phonetic balance of the material is not ideal. Although the Blizzard Challenge provides a clear picture of the performance of various speech synthesis techniques (e.g., concatenative, HMM-based or hybrid) under good conditions, it is not well understood how robust these algorithms are to less favourable conditions. In this paper, we analyse the performance of several speech synthesis methods under such conditions. This is, as far as we know, a new research topic: "Robust speech synthesis." As a consequence of our investigations, we propose a new robust training method for HMM-based speech synthesis, for use with speech data collected in unfavourable conditions.
Pages: 581-584
Number of pages: 4