Automatic Duration Weighting in Thai Unit-selection Speech Synthesis

Cited by: 1

Authors:

Saychum, S. [1]

Rugchatjaroen, A. [1]

Thatphithakkul, N. [1]

Wutiwiwatchai, C. [1]

Thangthai, A. [1]

Affiliations:

[1] Natl Elect & Comp Technol Ctr, Human Language Technol Lab, Pathum Thani, Thailand

Source:

ECTI-CON 2008: PROCEEDINGS OF THE 2008 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING/ELECTRONICS, COMPUTER, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY, VOLS 1 AND 2 | 2008

DOI:

10.1109/ECTICON.2008.4600492

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This paper presents an improvement in the naturalness of Thai unit-selection text-to-speech (TTS) synthesis through automatic weighting of the targeted cost. The intuition behind the proposed method is that the sensitivity of human perception may vary across phonemic and prosodic units. In this work, the unit-selection targeted cost of each phoneme unit is weighted differently according to its duration statistics and voicing characteristics. Two automatic weighting algorithms, based on the statistical mean and standard deviation of phoneme duration, are compared. A subjective test shows a mean opinion score (MOS) improvement of 0.46 over baseline speech synthesized without targeted-cost weighting.
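The abstract does not give the weighting formulas, so the following is only a minimal sketch of how per-phoneme duration statistics could be turned into cost weights. Everything in it is an assumption made for illustration: the function names, the toy durations, the choice to make one scheme proportional to the mean duration and the other inversely proportional to the standard deviation, and the normalization to an average weight of 1. The voicing characteristic mentioned in the abstract is left out because no detail about it is available here.

```python
from statistics import mean, pstdev


def duration_stats(durations_by_phoneme):
    """Return {phoneme: (mean, std)} for lists of durations in seconds."""
    return {p: (mean(d), pstdev(d)) for p, d in durations_by_phoneme.items()}


def normalized(values):
    """Scale positive values so that their average is 1.0."""
    avg = mean(values.values())
    return {p: v / avg for p, v in values.items()}


def mean_based_weights(stats):
    # Assumption: phonemes that are longer on average get a larger weight.
    return normalized({p: m for p, (m, _) in stats.items()})


def std_based_weights(stats, eps=1e-6):
    # Assumption: phonemes with more stable durations (small std) get a
    # larger weight; eps avoids division by zero.
    return normalized({p: 1.0 / (s + eps) for p, (_, s) in stats.items()})


def weighted_duration_cost(phoneme, target_dur, candidate_dur, weights):
    """Absolute duration mismatch scaled by the phoneme's weight."""
    return weights[phoneme] * abs(target_dur - candidate_dur)


# Hypothetical toy corpus: phoneme label -> observed durations (seconds).
toy_durations = {
    "a": [0.10, 0.12, 0.14],
    "k": [0.04, 0.05, 0.06],
    "aa": [0.20, 0.24, 0.28],
}

stats = duration_stats(toy_durations)
w_mean = mean_based_weights(stats)
w_std = std_based_weights(stats)

for phoneme in toy_durations:
    print(phoneme, round(w_mean[phoneme], 3), round(w_std[phoneme], 3))
```

In a unit-selection search, a weight of this kind would scale the duration term of each candidate unit's targeted cost before it is combined with the other cost terms. Which direction of weighting the paper actually uses cannot be determined from the abstract alone.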

Pages: 549-552

Page count: 4
