Scalable implementation of unit selection based text-to-speech system for embedded solutions

Cited by: 0

Authors:

Nukaga, Nobuo [1]

Kamoshida, Ryota [1]

Nagamatsu, Kenji [1]

Kitahara, Yoshinori [1]

Affiliations:

[1] Hitachi Ltd, Cent Res Lab, Kokubunji, Tokyo 1858601, Japan

Source:

2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13 | 2006

Keywords:

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this paper we propose two methods for implementing a unit selection-based text-to-speech engine on resource-limited embedded systems. Although unit selection-based text-to-speech technology has improved the quality of synthesized speech, there is a practical trade-off between the size of the database and the quality of the synthesized speech. That is, generating highly natural-sounding voices requires a large database and expensive computation, yet the text-to-speech system must also meet the specifications of the target system. To address this problem, we introduced frequency-based approaches to reduce the size of the speech database. The experimental results showed that the step-by-step downsizing method was better than the direct one in terms of the cumulative join cost and the target cost. Furthermore, several techniques were introduced and evaluated in order to implement our text-to-speech engine on an embedded system. The experimental results showed that the run-time workload for the test sentences was approximately 80 MIPS and that the implemented engine was useful and scalable for mid-class embedded systems.
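The abstract does not give the algorithmic details, so the following is only a minimal sketch of the general ideas it names, under stated assumptions. It shows a generic unit selection search that minimizes cumulative target cost plus join cost, and two ways of shrinking a unit database by how often each unit is selected: in one pass ("direct") or in small steps with usage recounted after each step ("step-by-step"). The toy pitch-based costs, the pruning criterion (selection frequency over a set of test sentences), and all function names are assumptions made for illustration, not the authors' method.

```python
from collections import Counter

# Assumption: a unit is a (unit_type, pitch_hz) tuple; real systems use richer features.
def target_cost(target, unit):
    return abs(target[1] - unit[1])          # mismatch with the requested pitch

def join_cost(prev_unit, unit):
    return abs(prev_unit[1] - unit[1])       # pitch discontinuity at the joint

def select_units(targets, database):
    """Dynamic-programming search; returns (cumulative cost, chosen units)."""
    if not targets:
        raise ValueError("targets must not be empty")
    best = None  # maps each candidate of the previous target to (cost, path)
    for target in targets:
        candidates = [u for u in database if u[0] == target[0]]
        if not candidates:
            raise ValueError(f"no unit of type {target[0]!r} in database")
        layer = {}
        for unit in candidates:
            cost = target_cost(target, unit)
            if best is None:
                layer[unit] = (cost, [unit])
            else:
                prev_cost, prev_path = min(
                    ((c + join_cost(p, unit), path) for p, (c, path) in best.items()),
                    key=lambda item: item[0],
                )
                layer[unit] = (prev_cost + cost, prev_path + [unit])
        best = layer
    return min(best.values(), key=lambda item: item[0])

def usage_counts(sentences, database):
    """Count how often each unit is selected across the test sentences."""
    counts = Counter()
    for targets in sentences:
        counts.update(select_units(targets, database)[1])
    return counts

def prune(sentences, database, size):
    """Keep the most frequently selected units, at least one per unit type."""
    counts = usage_counts(sentences, database)
    ranked = sorted(database, key=lambda u: counts[u], reverse=True)
    kept, seen_types = [], set()
    for unit in ranked:                      # first pass: preserve type coverage
        if unit[0] not in seen_types:
            kept.append(unit)
            seen_types.add(unit[0])
    for unit in ranked:                      # second pass: fill up to `size`
        if len(kept) >= size:
            break
        if unit not in kept:
            kept.append(unit)
    return kept

def downsize_direct(sentences, database, size):
    return prune(sentences, database, size)

def downsize_stepwise(sentences, database, size, step=1):
    db = list(database)
    while len(db) > size:
        pruned = prune(sentences, db, max(size, len(db) - step))
        if len(pruned) == len(db):           # cannot shrink without losing a type
            break
        db = pruned
    return db

# Hypothetical toy data for illustration only.
database = [("a", 100), ("a", 120), ("a", 140), ("b", 100), ("b", 125), ("b", 150)]
sentences = [[("a", 118), ("b", 122)], [("a", 102), ("b", 101)]]
direct_db = downsize_direct(sentences, database, 4)
stepwise_db = downsize_stepwise(sentences, database, 4)
```

On this toy data both strategies keep the same four units. A comparison like the one reported in the abstract would rerun `select_units` on each reduced database and compare the resulting cumulative costs over a larger corpus.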


Pages: 849-852

Page count: 4
