A waveform concatenation technique for text-to-speech synthesis

Cited by: 7
Authors
Panda S.P. [1]
Nayak A.K. [2]
Affiliations
[1] Department of CSE, Silicon Institute of Technology, Bhubaneswar, Odisha
[2] Department of CS&IT, Siksha ‘O’ Anusandhan University, Bhubaneswar, Odisha
Keywords
Concatenative technique; Indian languages; Speech synthesis; Text-to-speech system; Waveform concatenation
DOI
10.1007/s10772-017-9463-8
Abstract
Designing text-to-speech systems capable of producing natural-sounding speech segments in different Indian languages is a challenging and ongoing problem. Because of the large number of possible pronunciations in these languages, a concatenative speech synthesis technique must store a large number of speech segments in its speech database to achieve highly natural speech. However, the resulting database size makes such a system unusable on small handheld devices or human-computer interactive systems with limited storage resources. In this paper, we propose a fraction-based waveform concatenation technique that produces intelligible speech segments from a small-footprint speech database. The results of all the experiments performed show the effectiveness of the proposed technique in producing intelligible speech segments in different Indian languages, even with much lower storage and computation overhead than the existing syllable-based technique. © 2017, Springer Science+Business Media, LLC.
Pages: 959-976
Number of pages: 17