Outlier Detection and Removal for HMM-Based Speech Synthesis with an Insufficient Speech Database

Cited by: 1

Authors:

Hong, Doo Hwa [1]

Sung, June Sig

Oh, Kyung Hwan

Kim, Nam Soo

Affiliation:

[1] Seoul Natl Univ, Sch Elect Engn, Seoul 151742, South Korea

Source:

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2012 / Vol. E95D / No. 09

Funding:

National Research Foundation of Singapore;

Keywords:

HMM-based speech synthesis; decision tree-based clustering; outlier detection; insufficient speech database; ALGORITHM;

DOI:

10.1587/transinf.E95.D.2351

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Decision tree-based clustering and parameter estimation are essential steps in the training part of an HMM-based speech synthesis system. Both steps are usually performed under the maximum likelihood (ML) criterion. However, one drawback of the ML criterion is its sensitivity to outliers, which usually degrade the quality of the synthesized speech. In this letter, we propose an approach to detect and remove outliers for HMM-based speech synthesis. Experimental results show that the proposed approach can improve the quality of the synthesized speech, particularly when the available training speech database is insufficient.

Pages: 2351-2354

Page count: 4

