Speaker and style adaptation using average voice model for style control in HMM-based speech synthesis

Cited by: 0
Authors
Tachibana, Makoto [1]
Izawa, Shinsuke [1]
Nose, Takashi [1]
Kobayashi, Takao [1]
Affiliation
[1] Tokyo Inst Technol, Interdisciplinary Grad Sch Sci & Engn, Yokohama, Kanagawa 2268502, Japan
Keywords
expressive speech synthesis; style control; hidden Markov model; speaker adaptation; average voice model;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
We propose a technique for synthesizing speech with desired style expressivity of an arbitrary target speaker's voice. In an MLLR-based speaker adaptation technique for multiple regression hidden semi-Markov model (MRHSMM), the quality of synthesized speech crucially depends on the initial MRHSMM trained from a certain source speaker's data and it is not always possible to synthesize natural sounding speech with a given target speaker's voice. To overcome this problem, we perform simultaneous adaptation of speaker and style from an average voice model. Experimental results show that the proposed technique provides more natural sounding speech than the conventional one with speaker adaptation only.
Pages: 4633-4636
Page count: 4
Related papers
  • [1] HMM-Based Style Control for Expressive Speech Synthesis with Arbitrary Speaker's Voice Using Model Adaptation
    Nose, Takashi
    Tachibana, Makoto
    Kobayashi, Takao
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2009, E92D (03) : 489 - 497
  • [2] Roles of the Average Voice in Speaker-adaptive HMM-based Speech Synthesis
    Yamagishi, Junichi
    Watts, Oliver
    King, Simon
    Usabaev, Bela
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 418 - +
  • [3] SPEAKER-INDEPENDENT STYLE CONVERSION FOR HMM-BASED EXPRESSIVE SPEECH SYNTHESIS
    Kanagawa, Hiroki
    Nose, Takashi
    Kobayashi, Takao
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7864 - 7868
  • [4] A style control technique for HMM-based expressive speech synthesis
    Nose, Takashi
    Yamagishi, Junichi
    Masuko, Takashi
    Kobayashi, Takao
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (09) : 1406 - 1413
  • [5] HMM-based Speaker Characteristics Emphasis Using Average Voice Model
    Nose, Takashi
    Adada, Junichi
    Kobayashi, Takao
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2599 - 2602
  • [6] A training method of average voice model for HMM-based speech synthesis
    Yamagishi, J
    Tamura, M
    Masuko, T
    Tokuda, K
    Kobayashi, T
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2003, E86A (08) : 1956 - 1963
  • [7] Speaking style adaptation using context clustering decision tree for HMM-based speech synthesis
    Yamagishi, J
    Tachibana, M
    Masuko, T
    Kobayashi, T
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 5 - 8
  • [8] Frequency Warping for Speaker Adaptation in HMM-based Speech Synthesis
    Gao, Weixun
    Cao, Qiying
    [J]. JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2014, 30 (04) : 1149 - 1166
  • [9] Speaker Adaptation using Nonlinear Regression Techniques for HMM-based Speech Synthesis
    Hong, Doo Hwa
    Kang, Shin Jae
    Lee, Joun Yeop
    Kim, Nam Soo
    [J]. 2014 TENTH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING (IIH-MSP 2014), 2014, : 586 - 589
  • [10] An acoustic model adaptation using hmm-based speech synthesis
    Tanaka, K
    Kuroiwa, S
    Tsuge, S
    Ren, F
    [J]. 2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 368 - 373