Spectral Smoothening Based Waveform Concatenation Technique for Speech Quality Enhancement in Text-to-Speech Systems

Cited by: 1

Authors:

Panda, Soumya Priyadarsini [1]

Nayak, Ajit Kumar [2]

Affiliations:

[1] Silicon Inst Technol, Bhubaneswar, Odisha, India

[2] Inst Tech Educ & Res, Bhubaneswar, Odisha, India

Source:

ADVANCED COMPUTING AND INTELLIGENT ENGINEERING | 2020 / Vol. 1082

Keywords:

Speech synthesis; Waveform concatenation; Spectral smoothening; Optimal coupling; Speech quality;

DOI:

10.1007/978-981-15-1081-6_36

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This work presents a spectral smoothening based concatenation technique for enhancing the quality of speech produced by text-to-speech systems. Because hard waveform concatenation may cause audible glitches at the segment boundaries, degrading the overall quality of the produced speech, an optimal coupling based spectral smoothening approach is adopted to smooth the spectral envelope of the produced speech and thereby enhance its quality. A number of experiments were performed to analyze the performance of the proposed technique, using several speech quality evaluation parameters, and the results are compared with other concatenative techniques. The results obtained in all the experiments show the effectiveness of the proposed text-to-speech conversion technique in producing high-quality speech.
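
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of optimal coupling as the term is commonly used in concatenative synthesis. Instead of joining two segments at fixed boundaries, it searches the tail of the first segment and the head of the second for the pair of frames whose spectra match best, then joins there with a short crossfade. The function names, the log-magnitude spectral distance, the linear crossfade, and all parameter values are assumptions made for illustration, not the authors' method.

```python
import numpy as np

def frame_spectrum(x, start, n_fft):
    """Log-magnitude spectrum of one Hann-windowed frame (illustrative feature)."""
    frame = x[start:start + n_fft] * np.hanning(n_fft)
    return np.log(np.abs(np.fft.rfft(frame)) + 1e-9)

def optimal_coupling_join(a, b, n_fft=256, hop=64, search=1024, fade=64):
    """Join segments a and b at the frame pair with the smallest spectral distance.

    Hypothetical helper: returns (joined_signal, cut_a, cut_b), where cut_a and
    cut_b are the sample indices at which a and b were cut.
    """
    if len(a) < n_fft or len(b) < n_fft:
        raise ValueError("both segments must contain at least n_fft samples")
    if fade > n_fft // 2:
        raise ValueError("fade must not exceed n_fft // 2")

    # Candidate frame starts: the tail of a and the head of b.
    a_lo = max(0, len(a) - search - n_fft)
    a_starts = range(a_lo, len(a) - n_fft + 1, hop)
    b_starts = range(0, min(search, len(b) - n_fft) + 1, hop)

    # Exhaustive search for the best-matching frame pair.
    best = None
    for i in a_starts:
        spec_a = frame_spectrum(a, i, n_fft)
        for j in b_starts:
            dist = np.sum((spec_a - frame_spectrum(b, j, n_fft)) ** 2)
            if best is None or dist < best[0]:
                best = (dist, i, j)
    _, i, j = best

    # Cut each segment at the centre of its best-matching frame.
    cut_a = i + n_fft // 2
    cut_b = j + n_fft // 2

    # Linear crossfade over `fade` samples straddling the join point.
    left = a[:cut_a]
    right = b[cut_b - fade:]
    w = np.linspace(0.0, 1.0, fade, endpoint=False)
    overlap = left[-fade:] * (1.0 - w) + right[:fade] * w
    joined = np.concatenate([left[:-fade], overlap, right[fade:]])
    return joined, cut_a, cut_b
```

Called as `joined, cut_a, cut_b = optimal_coupling_join(a, b)` on two one-dimensional NumPy arrays sampled at the same rate, the sketch leaves both segments untouched outside the `fade`-sample crossfade region. A real system would likely use a perceptually motivated distance and pitch-synchronous frames; those choices are outside what the abstract states.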

Pages: 425-432

Page count: 8
