Time and space-efficient architecture for a corpus-based text-to-speech synthesis system

Cited by: 17
Authors
Rojc, Matej [1]
Kacic, Zdravko [1]
Affiliations
[1] Univ Maribor, Fac Elect Engn & Comp Sci, SLO-2000 Maribor, Slovenia
Keywords
corpus-based text-to-speech system; finite-state machines (FSM); heterogeneous-relation graphs (HRG); queuing mechanism;
DOI
10.1016/j.specom.2007.01.007
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper proposes a time- and space-efficient architecture for a text-to-speech (TTS) synthesis system. The proposed architecture can be used efficiently in unlimited-domain applications that require multilingual or polyglot functionality. The integration of a queuing mechanism, heterogeneous relation graphs and finite-state machines yields a powerful, reliable and easily maintainable architecture for the TTS system. A flexible, language-independent framework efficiently integrates all the algorithms used within the TTS system. Heterogeneous relation graphs are used for representing linguistic information and for feature construction. Finite-state machines are used for time- and space-efficient representation of language resources, for time- and space-efficient lookup processes, and for separating language-dependent resources from the language-independent TTS engine. The queuing mechanism consists of several dequeue data structures and is responsible for activating all the TTS engine modules that have to process the input text. In the proposed architecture, all modules use the same data structure for gathering linguistic information about the input text. All input and output formats are compatible; the structure is modular, interchangeable, easily maintainable and object-oriented. The proposed architecture was successfully used to implement the Slovenian PLATTOS corpus-based TTS system, as presented in this paper. (c) 2007 Elsevier B.V. All rights reserved.
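The queuing idea in the abstract can be illustrated with a minimal sketch. It is not the PLATTOS implementation: the `Utterance` class, the two toy modules, the `run_pipeline` function, and the lexicon entries (placeholder strings, not real transcriptions) are all assumptions made for illustration. It only shows double-ended queues handing one shared data structure from module to module.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Utterance:
    """Hypothetical shared structure that every module reads and extends."""
    text: str
    relations: dict = field(default_factory=dict)


def tokenize(utt):
    # Toy tokenizer (assumption): split the input text on whitespace.
    utt.relations["Token"] = utt.text.split()
    return utt


def lookup_phones(utt):
    # Toy lexicon lookup (assumption); a plain dict stands in for the
    # compiled finite-state machine a real system would query.
    lexicon = {"dober": "d o b e r", "dan": "d a n"}
    utt.relations["Phones"] = [
        lexicon.get(token.lower(), "<unk>") for token in utt.relations["Token"]
    ]
    return utt


def run_pipeline(sentences, modules):
    """Pass each utterance through the modules; queues[i] feeds modules[i]."""
    queues = [deque() for _ in range(len(modules) + 1)]
    queues[0].extend(Utterance(s) for s in sentences)
    for i, module in enumerate(modules):
        # Drain the module's input queue into the next module's queue.
        while queues[i]:
            queues[i + 1].append(module(queues[i].popleft()))
    return list(queues[-1])


results = run_pipeline(["Dober dan"], [tokenize, lookup_phones])
```

This sketch runs the stages one after another; whether and how the modules in the described system are activated concurrently is not stated in the abstract.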
Pages: 230-249
Page count: 20