Scalable implementation of unit selection based text-to-speech system for embedded solutions

Authors
Nukaga, Nobuo [1]
Kamoshida, Ryota [1]
Nagamatsu, Kenji [1]
Kitahara, Yoshinori [1]
Affiliations
[1] Hitachi Ltd, Cent Res Lab, Kokubunji, Tokyo 1858601, Japan
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
In this paper, we propose two methods for implementing a unit selection-based text-to-speech engine on resource-limited embedded systems. Although unit selection-based text-to-speech technology has improved the quality of synthesized speech, there is a practical trade-off between the size of the speech database and the quality of the synthesized speech. That is, generating highly natural-sounding voices requires a large database and expensive computation, whereas the text-to-speech system must meet the specifications of the target system. To address this problem, we introduced frequency-based approaches to reduce the size of the speech database. The experimental results showed that the step-by-step downsizing method was better than the direct one in terms of the cumulative join cost and the target cost. Furthermore, several techniques were introduced and evaluated in order to implement our text-to-speech engine on an embedded system. The experimental results showed that the run-time workload for the test sentences was approximately 80 MIPS and that the implemented engine was useful and scalable for mid-class embedded systems.
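The abstract does not spell out how the frequency-based downsizing works, so the following is only a minimal sketch under stated assumptions: units are toy `(id, phone, pitch)` tuples, the target cost and join cost are simple pitch differences, and "frequency" is taken to mean how often a unit is chosen when a small corpus is synthesized. The function names, the cost definitions, and the example data are all hypothetical. The sketch contrasts a direct reduction (count once, prune once) with a step-by-step reduction (recount after every pruning step).

```python
from collections import Counter

# Toy unit: (unit_id, phone, pitch). Costs below are illustrative assumptions.

def select_units(targets, db):
    """Viterbi search minimizing cumulative target cost + join cost."""
    layers, prev = [], None
    for phone, want in targets:
        layer = {}
        for u in (v for v in db if v[1] == phone):
            tc = abs(u[2] - want)  # assumed target cost: pitch mismatch
            if prev is None:
                layer[u] = (tc, None)
            else:
                # assumed join cost: pitch jump between adjacent units
                p, c = min(((p, pc + abs(p[2] - u[2]))
                            for p, (pc, _) in prev.items()),
                           key=lambda x: x[1])
                layer[u] = (c + tc, p)
        layers.append(layer)
        prev = layer
    u = min(prev, key=lambda k: prev[k][0])
    total, path = prev[u][0], []
    for layer in reversed(layers):  # follow back-pointers
        path.append(u)
        u = layer[u][1]
    return path[::-1], total

def usage_counts(corpus, db):
    """How often each unit is selected when synthesizing the corpus."""
    counts = Counter({u: 0 for u in db})
    for targets in corpus:
        counts.update(select_units(targets, db)[0])
    return counts

def prune(db, counts, n_remove):
    """Drop the least-used units, always keeping one unit per phone."""
    remaining = list(db)
    for u in sorted(db, key=lambda v: counts[v]):
        if n_remove == 0:
            break
        if sum(1 for v in remaining if v[1] == u[1]) > 1:
            remaining.remove(u)
            n_remove -= 1
    return remaining

def downsize_direct(db, corpus, size):
    """Count usage once, then remove everything in a single pass."""
    return prune(db, usage_counts(corpus, db), len(db) - size)

def downsize_stepwise(db, corpus, size, step=1):
    """Remove a few units, re-run selection, recount, and repeat."""
    while len(db) > size:
        smaller = prune(db, usage_counts(corpus, db),
                        min(step, len(db) - size))
        if len(smaller) == len(db):
            break  # nothing more can be removed safely
        db = smaller
    return db

# Hypothetical example data.
db = [(0, "a", 100), (1, "a", 120), (2, "a", 140),
      (3, "b", 100), (4, "b", 130), (5, "b", 160)]
corpus = [[("a", 110), ("b", 105)],
          [("a", 135), ("b", 130)],
          [("b", 150), ("a", 140)]]
```

Comparing the total cost returned by `select_units` on the two reduced databases gives a toy analogue of the cumulative-cost comparison mentioned above. The sketch makes no claim about which variant wins on real data.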
Pages: 849-852
Page count: 4