A Covariance-Tying Technique for HMM-Based Speech Synthesis

Cited by: 10

Authors:

Oura, Keiichiro ^{[1]}

Zen, Heiga ^{[1]}

Nankaku, Yoshihiko ^{[1]}

Lee, Akinobu ^{[1]}

Tokuda, Keiichi ^{[1]}

Affiliations:

[1] Nagoya Inst Technol, Dept Comp Sci & Engn, Nagoya, Aichi 4668555, Japan

Source:

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2010 / Vol. E93D / No. 03

Keywords:

HMM; speech synthesis; decision tree; context-clustering; MDL criterion; embedded device

DOI:

10.1587/transinf.E93.D.595

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

A technique for reducing the footprints of HMM-based speech synthesis systems by tying all covariance matrices of state distributions is described. HMM-based speech synthesis systems usually leave smaller footprints than unit-selection synthesis systems because they store statistics rather than speech waveforms. However, further reduction is essential to put them on embedded devices, which have limited memory. In accordance with the empirical knowledge that covariance matrices have a smaller impact on the quality of synthesized speech than mean vectors, we propose a technique for clustering mean vectors while tying all covariance matrices. Subjective listening test results showed that the proposed technique can shrink the footprints of an HMM-based speech synthesis system while retaining the quality of the synthesized speech.
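As a rough illustration of why tying covariances shrinks the footprint, the sketch below counts the stored floating-point values for a set of clustered state distributions. It assumes diagonal covariance matrices, and the leaf count and feature dimension are hypothetical values chosen for the example, not figures from the paper; `footprint_params` is likewise a made-up helper name.

```python
def footprint_params(num_leaves: int, dim: int, tied: bool) -> int:
    """Count stored floats for `num_leaves` Gaussians of dimension `dim`.

    Assumption: covariances are diagonal, so each one costs `dim` values.
    """
    means = num_leaves * dim                        # one mean vector per leaf
    variances = dim if tied else num_leaves * dim   # one shared vs. one per leaf
    return means + variances


# Hypothetical sizes, chosen only for illustration.
untied = footprint_params(num_leaves=1000, dim=75, tied=False)
tied = footprint_params(num_leaves=1000, dim=75, tied=True)
print(f"untied: {untied} values, tied: {tied} values")
```

Under these assumptions, and for the same number of mean clusters, tying roughly halves the stored statistics, since the variances no longer grow with the number of leaves.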


Pages: 595-601

Page count: 7
