Spectral Smoothening Based Waveform Concatenation Technique for Speech Quality Enhancement in Text-to-Speech Systems

Cited by: 1
Authors
Panda, Soumya Priyadarsini [1]
Nayak, Ajit Kumar [2]
Affiliations
[1] Silicon Inst Technol, Bhubaneswar, Odisha, India
[2] Inst Tech Educ & Res, Bhubaneswar, Odisha, India
Keywords
Speech synthesis; Waveform concatenation; Spectral smoothening; Optimal coupling; Speech quality
DOI
10.1007/978-981-15-1081-6_36
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This work presents a spectral smoothening based concatenation technique for enhancing the quality of speech produced by text-to-speech systems. Because hard waveform concatenation may cause audible glitches at the segment boundaries, which degrade the overall quality of the produced speech, an optimal coupling based spectral smoothening approach is adopted to smooth the spectral envelope of the produced speech and thereby enhance its quality. A number of experiments were performed to analyze the performance of the proposed technique, using different speech quality evaluation parameters, and the results are compared with those of other concatenative techniques. The results obtained in all the experiments show the effectiveness of the proposed text-to-speech conversion technique in producing high-quality speech.
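The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea behind optimal coupling, not the authors' implementation. It assumes two speech segments given as plain lists of samples, searches a small region near the boundary for the pair of frames whose log-magnitude spectra are closest, and cross-fades those frames instead of butting the segments together. The function names and the parameters `frame_len`, `search`, and `hop` are hypothetical and chosen for illustration.

```python
import cmath
import math


def log_spectrum(frame):
    """Log-magnitude spectrum of one frame via a naive DFT (adequate for short frames)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(math.log(abs(acc) + 1e-9))  # small offset avoids log(0)
    return spec


def spectral_distance(a, b):
    """Euclidean distance between two log-magnitude spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def optimal_coupling(left, right, frame_len=32, search=64, hop=8):
    """Find the frame pair near the boundary with the smallest spectral distance.

    Returns (cut_left, cut_right): the matched frame of `left` ends at
    cut_left, and the matched frame of `right` starts at cut_right.
    Assumes both segments hold at least frame_len samples.
    """
    best = (float("inf"), len(left), 0)
    left_starts = range(max(0, len(left) - search - frame_len),
                        len(left) - frame_len + 1, hop)
    right_starts = range(0, min(search, len(right) - frame_len) + 1, hop)
    right_specs = [(j, log_spectrum(right[j:j + frame_len])) for j in right_starts]
    for i in left_starts:
        spec_i = log_spectrum(left[i:i + frame_len])
        for j, spec_j in right_specs:
            d = spectral_distance(spec_i, spec_j)
            if d < best[0]:
                best = (d, i + frame_len, j)
    return best[1], best[2]


def smooth_join(left, right, frame_len=32, search=64, hop=8):
    """Join two segments by linearly cross-fading the two matched frames."""
    cut_l, cut_r = optimal_coupling(left, right, frame_len, search, hop)
    start = cut_l - frame_len
    head = left[:start]
    tail = right[cut_r + frame_len:]
    fade = []
    for t in range(frame_len):
        w = (t + 0.5) / frame_len  # weight ramps from about 0 to about 1
        fade.append((1.0 - w) * left[start + t] + w * right[cut_r + t])
    return head + fade + tail
```

Calling `smooth_join(left, right)` on two segments returns a single list of samples that is shorter than the hard concatenation `left + right`, because samples outside the chosen cut points are discarded and the matched frames overlap. A linear cross-fade is the simplest possible smoothing choice here; a system of the kind the abstract describes would more plausibly smooth a spectral-envelope representation across the join.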
Pages: 425-432
Number of pages: 8