Discrete/Continuous Modelling of Speaking Style in HMM-based Speech Synthesis: Design and Evaluation

Cited by: 0

Authors:

Obin, Nicolas [1,2]

Lanchantin, Pierre [1]

Lacheret, Anne [2]

Rodet, Xavier [1]

Affiliations:

[1] IRCAM, Paris, France

[2] Univ Paris Ouest, Modyco Lab, Nanterre, France

Source:

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011

Keywords:

speaking style; speech synthesis; speech prosody; average modelling

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper assesses the ability of an HMM-based speech synthesis system to model the speech characteristics of various speaking styles. A discrete/continuous HMM is presented to model the symbolic and acoustic speech characteristics of a speaking style. The proposed model is used to model the average characteristics of a speaking style that is shared among various speakers, depending on specific situations of speech communication. The evaluation consists of an identification experiment on 4 speaking styles based on delexicalized speech, which is compared to a similar experiment on natural speech. The comparison is discussed and reveals that the discrete/continuous HMM consistently models the speech characteristics of a speaking style.

Pages: 2796 / +

Page count: 2
