Speaker and style adaptation using average voice model for style control in HMM-based speech synthesis

Times cited: 0

Authors:

Tachibana, Makoto [1]

Izawa, Shinsuke [1]

Nose, Takashi [1]

Kobayashi, Takao [1]

Affiliations:

[1] Tokyo Inst Technol, Interdisciplinary Grad Sch Sci & Engn, Yokohama, Kanagawa 2268502, Japan

Source:

2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008

Keywords:

expressive speech synthesis; style control; hidden Markov model; speaker adaptation; average voice model;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

We propose a technique for synthesizing speech with desired style expressivity of an arbitrary target speaker's voice. In an MLLR-based speaker adaptation technique for multiple regression hidden semi-Markov model (MRHSMM), the quality of synthesized speech crucially depends on the initial MRHSMM trained from a certain source speaker's data and it is not always possible to synthesize natural sounding speech with a given target speaker's voice. To overcome this problem, we perform simultaneous adaptation of speaker and style from an average voice model. Experimental results show that the proposed technique provides more natural sounding speech than the conventional one with speaker adaptation only.


Pages: 4633 - 4636

Number of pages: 4

50 entries in total

[1] HMM-Based Style Control for Expressive Speech Synthesis with Arbitrary Speaker's Voice Using Model Adaptation
Nose, Takashi
Tachibana, Makoto
Kobayashi, Takao
[J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2009, E92D (03) : 489 - 497
[2] Roles of the Average Voice in Speaker-adaptive HMM-based Speech Synthesis
Yamagishi, Junichi
Watts, Oliver
King, Simon
Usabaev, Bela
[J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 418 - +
[3] SPEAKER-INDEPENDENT STYLE CONVERSION FOR HMM-BASED EXPRESSIVE SPEECH SYNTHESIS
Kanagawa, Hiroki
Nose, Takashi
Kobayashi, Takao
[J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7864 - 7868
[4] A style control technique for HMM-based expressive speech synthesis
Nose, Takashi
Yamagishi, Junichi
Masuko, Takashi
Kobayashi, Takao
[J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (09) : 1406 - 1413
[5] HMM-based Speaker Characteristics Emphasis Using Average Voice Model
Nose, Takashi
Adada, Junichi
Kobayashi, Takao
[J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2599 - 2602
[6] A training method of average voice model for HMM-based speech synthesis
Yamagishi, J
Tamura, M
Masuko, T
Tokuda, K
Kobayashi, T
[J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2003, E86A (08) : 1956 - 1963
[7] Speaking style adaptation using context clustering decision tree for HMM-based speech synthesis
Yamagishi, J
Tachibana, M
Masuko, T
Kobayashi, T
[J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 5 - 8
[8] Frequency Warping for Speaker Adaptation in HMM-based Speech Synthesis
Gao, Weixun
Cao, Qiying
[J]. JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2014, 30 (04) : 1149 - 1166
[9] Speaker Adaptation using Nonlinear Regression Techniques for HMM-based Speech Synthesis
Hong, Doo Hwa
Kang, Shin Jae
Lee, Joun Yeop
Kim, Nam Soo
[J]. 2014 TENTH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING (IIH-MSP 2014), 2014, : 586 - 589
[10] An acoustic model adaptation using hmm-based speech synthesis
Tanaka, K
Kuroiwa, S
Tsuge, S
Ren, F
[J]. 2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 368 - 373
