USE OF FUNDAMENTAL FREQUENCIES SHAPED BY GENERATION PROCESS MODEL FOR HMM-BASED SPEECH SYNTHESIS

Authors
Hirose, Keikichi [1]
Hashimoto, Hiroya [2]
Hyakutake, Kyota [2]
Saito, Daisuke [1]
Minematsu, Nobuaki [2]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Dept Informat & Commun Engn, Tokyo, Japan
[2] Univ Tokyo, Dept Elect Engn & Informat Syst, Grad Sch Engn, Tokyo, Japan
Source
2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP) | 2014
Keywords
Generation process model; HMM-based speech synthesis; F0; residual; Flexible F0 control
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract
The generation process model of fundamental frequency (F0) contours is known to represent the global movements of F0 while keeping a clear relation with the linguistic information of utterances. Although HMM-based speech synthesis can generate speech of good quality, problems arising from its frame-by-frame processing have been pointed out. These problems are expected to be solved by incorporating the model constraints. A method is developed that uses F0 contours approximated by the model, instead of observed F0 contours, for HMM training. Listening experiments show a clear improvement in the quality of synthetic speech. In this method, fragments of F0 contours not represented by the model (F0 residuals) are ignored. A scheme is further introduced to cope with this issue: F0 residuals are also included in the training and synthesis processes of HMM-based speech synthesis, and the generated F0 residuals are added to the model-based F0s before waveform generation. The model constraint has another merit: relations between generated F0 contours and texts are clear, so it is possible to add linguistic information such as emphasis to synthetic speech, or to change speaking styles, by manipulating F0s within the F0 model framework. Several experimental results supporting the advantages of the method are shown.
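The abstract gives no formulas, so the following is only a minimal sketch of the idea, assuming that the generation process model is the Fujisaki command-response model, in which log F0 is a baseline plus phrase and accent components. The constants alpha = 3.0, beta = 20.0 and gamma = 0.9 are commonly used defaults rather than values from this paper, and every function name here is hypothetical. Scaling accent amplitudes to express emphasis is likewise an illustrative assumption, not the authors' stated procedure.

```python
import math

def phrase_response(t, alpha=3.0):
    """Phrase control impulse response: alpha^2 * t * exp(-alpha * t) for t >= 0."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0

def accent_response(t, beta=20.0, gamma=0.9):
    """Accent control step response, capped at gamma, for t >= 0."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def model_log_f0(t, fb, phrases, accents):
    """Model-based log F0 at time t (seconds).

    fb:      baseline frequency in Hz
    phrases: list of (onset_time, magnitude) phrase commands
    accents: list of (onset_time, offset_time, amplitude) accent commands
    """
    value = math.log(fb)
    for t0, ap in phrases:
        value += ap * phrase_response(t - t0)
    for t1, t2, aa in accents:
        value += aa * (accent_response(t - t1) - accent_response(t - t2))
    return value

def add_residual(model_contour, residual_contour):
    """Add generated log-F0 residuals to the model-based contour, frame by frame."""
    return [m + r for m, r in zip(model_contour, residual_contour)]

def scale_accents(accents, factor):
    """Hypothetical emphasis control: scale every accent command amplitude."""
    return [(t1, t2, aa * factor) for t1, t2, aa in accents]

# Example: one phrase command and one accent command, sampled every 5 ms.
times = [i * 0.005 for i in range(400)]
phrases = [(0.0, 0.4)]
accents = [(0.3, 0.8, 0.3)]
contour = [model_log_f0(t, 100.0, phrases, accents) for t in times]
# Placeholder residuals (all zero) standing in for HMM-generated residuals.
final_contour = add_residual(contour, [0.0] * len(contour))
emphasized = [model_log_f0(t, 100.0, phrases, scale_accents(accents, 1.5)) for t in times]
```

Because the contour is fully determined by a handful of commands, changing one command (as in `scale_accents`) changes the contour in a predictable way, which is the kind of control the abstract attributes to the model constraint.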
Pages: 555-560
Page count: 6