Evaluation of Finnish Unit Selection and HMM-based Speech Synthesis

Cited by: 0

Authors:

Silen, Hanna [1]

Helander, Elina [1]

Nurminen, Jani [2]

Gabbouj, Moncef [1]

Affiliations:

[1] Tampere Univ Technol, Dept Signal Proc, Tampere, Finland

[2] Nokia Devices R&D, Tampere, Finland

Source:

INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008

Funding:

Academy of Finland;

Keywords:

speech synthesis; unit selection; hidden Markov models;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Unit selection and hidden Markov model (HMM) based synthesis have become the dominant techniques in text-to-speech (TTS) research. In this work, we combine HMM-based signal generation with the front end originally designed for unit selection based Finnish TTS, and we evaluate the prosody of the output generated by the two synthesis techniques using the same speech database. Furthermore, we study the effect that the training set size has on prosody and intelligibility in HMM-based synthesis. The results indicate that the HMM-based approach is capable of providing better prosody than unit selection even if the training set size is severely limited. The size of the training set, however, affects the prosodic quality and intelligibility of the HMM-based synthesizer.


Pages: 1853 - +

Page count: 2

50 items in total

[31] State duration modeling for HMM-based speech synthesis
Zen, Heiga
Masuko, Takashi
Tokuda, Keiichi
Yoshimura, Takayoshi
Kobayashi, Takao
Kitamura, Tadashi
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (03) : 692 - 693
[32] Analysis and HMM-based synthesis of hypo and hyperarticulated speech
Picart, Benjamin
Drugman, Thomas
Dutoit, Thierry
COMPUTER SPEECH AND LANGUAGE, 2014, 28 (02) : 687 - 707
[33] Optimal Number of States in HMM-Based Speech Synthesis
Hanzlicek, Zdenek
TEXT, SPEECH, AND DIALOGUE, TSD 2017, 2017, 10415 : 353 - 361
[34] A trainable excitation model for HMM-based speech synthesis
Maia, R.
Toda, T.
Zen, H.
Nankaku, Y.
Tokuda, K.
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007 : 1125 - +
[35] Speaker interpolation for HMM-based speech synthesis system
Yoshimura, Takayoshi, 2000, Acoustical Soc Jpn, Tokyo, Japan (21):
[36] Contextual Additive Structure for HMM-Based Speech Synthesis
Takaki, Shinji
Nankaku, Yoshihiko
Tokuda, Keiichi
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2014, 8 (02) : 229 - 238
[37] Parameterization of Vocal Fry in HMM-Based Speech Synthesis
Silen, Hanna
Helander, Elina
Nurminen, Jani
Gabbouj, Moncef
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009 : 1735 - +
[38] Noise in HMM-Based Speech Synthesis Adaptation: Analysis, Evaluation Methods and Experiments
Karhila, Reima
Remes, Ulpu
Kurimo, Mikko
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2014, 8 (02) : 285 - 295
[39] REACTIVE AND CONTINUOUS CONTROL OF HMM-BASED SPEECH SYNTHESIS
Astrinaki, Maria
d'Alessandro, Nicolas
Picart, Benjamin
Drugman, Thomas
Dutoit, Thierry
2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012 : 252 - 257
[40] An HMM-based speech synthesis system applied to English
Tokuda, K
Zen, H
Black, AW
PROCEEDINGS OF THE 2002 IEEE WORKSHOP ON SPEECH SYNTHESIS, 2002 : 227 - 230
