Robustness of HMM-based Speech Synthesis

Authors
Yamagishi, Junichi [1 ]
Ling, Zhenhua [1 ]
King, Simon [1 ]
Affiliation
[1] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh, Midlothian, Scotland
Keywords
speech synthesis; HMM; unit selection; HTS
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As speech synthesis techniques become more advanced, we are able to consider building high-quality voices from data collected outside the usual highly controlled recording studio environment. This presents new challenges that are not present in conventional text-to-speech synthesis: the available speech data are not perfectly clean, the recording conditions are not consistent, and/or the phonetic balance of the material is not ideal. Although the Blizzard Challenge provides a clear picture of how various speech synthesis techniques (e.g., concatenative, HMM-based or hybrid) perform under good conditions, it is not well understood how robust these algorithms are to less favourable conditions. In this paper, we analyse the performance of several speech synthesis methods under such conditions. This is, as far as we know, a new research topic: "Robust speech synthesis." As a consequence of our investigations, we propose a new robust training method for HMM-based speech synthesis, for use with speech data collected in unfavourable conditions.
Pages: 581-584
Number of pages: 4