Comparison of Approaches for Instrumentally Predicting the Quality of Text-To-Speech Systems

Cited by: 0

Authors:

Moeller, Sebastian ^{[1]}

Hinterleitner, Florian ^{[1]}

Falk, Tiago H. ^{[2]}

Polzehl, Tim ^{[1]}

Affiliations:

[1] TU Berlin, Quality and Usability Lab, Deutsche Telekom Laboratories, Berlin, Germany

[2] Bloorview Research Institute, Toronto, ON, Canada

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2 | 2010

Keywords:

speech synthesis; quality prediction; Quality of Experience (QoE)

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we compare and combine different approaches for instrumentally predicting the perceived quality of Text-to-Speech systems. First, a log-likelihood is determined by comparing features extracted from the synthesized speech signal with features trained on natural speech. Second, parameters are extracted which capture quality-relevant degradations of the synthesized speech signal. Both approaches are combined and evaluated on three auditory test databases. The results show that auditory quality judgments can in many cases be predicted with a sufficiently high accuracy and reliability, but that there are considerable differences, mainly between male and female speech samples.
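The first approach in the abstract scores synthesized speech by how likely its features are under a model trained on natural speech. The following is only a minimal sketch of that general idea, assuming a diagonal-covariance Gaussian as the reference model. This record does not state the paper's actual features or model, so the function names, the model choice, and the toy feature frames are all hypothetical.

```python
import math

def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from reference feature frames."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [
        # A small floor keeps the variance strictly positive.
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances

def avg_log_likelihood(frames, means, variances):
    """Mean per-frame log-likelihood of test frames under the reference model."""
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, means, variances):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

# Toy, made-up feature frames (assumption: not data from the paper).
natural = [[0.0, 1.0], [0.2, 0.8], [-0.2, 1.2], [0.1, 0.9]]
close_to_natural = [[0.05, 1.0], [-0.1, 1.1]]
far_from_natural = [[1.5, -0.5], [2.0, 0.0]]

means, variances = fit_diag_gaussian(natural)
score_close = avg_log_likelihood(close_to_natural, means, variances)
score_far = avg_log_likelihood(far_from_natural, means, variances)
```

In this sketch, frames that resemble the reference data receive a higher average log-likelihood than frames far from it. A score of this kind could then serve as one input to a quality predictor.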


Pages: 1325 / +

Page count: 2
