Automatic Duration Weighting in Thai Unit-selection Speech Synthesis

Cited by: 1
Authors
Saychum, S. [1]
Rugchatjaroen, A. [1]
Thatphithakkul, N. [1]
Wutiwiwatchai, C. [1]
Thangthai, A. [1]
Affiliations
[1] National Electronics and Computer Technology Center (NECTEC), Human Language Technology Laboratory, Pathum Thani, Thailand
DOI
10.1109/ECTICON.2008.4600492
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a naturalness improvement in Thai unit-selection text-to-speech (TTS) synthesis achieved by automatically weighting the targeted cost. The intuition behind the proposed method is that the sensitivity of human perception may vary across phonemic and prosodic units. In this work, the unit-selection targeted cost of each phoneme unit is weighted differently according to its duration statistics and voicing characteristics. Two automatic weighting algorithms, based on the statistical mean and standard deviation of phoneme duration, are evaluated against each other. A subjective test shows a mean opinion score (MOS) improvement of 0.46 over baseline speech synthesized without targeted-cost weighting.
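The abstract does not give the weighting formulas, so the following Python sketch is only an illustration of how per-phoneme weights might be derived from duration statistics. Everything in it is an assumption made for illustration: the function names, the choice to let weights grow with mean duration in one scheme and shrink with standard deviation in the other, the normalization to an average weight of 1.0, and the use of absolute duration difference as the cost. Voicing is not modeled here.

```python
from statistics import mean, pstdev


def duration_stats(durations_by_phoneme):
    """Return {phoneme: (mean, std)} from lists of durations in seconds."""
    return {p: (mean(d), pstdev(d)) for p, d in durations_by_phoneme.items()}


def normalize(raw):
    """Scale raw weights so that they average to 1.0 across phonemes."""
    avg = mean(raw.values())
    return {p: w / avg for p, w in raw.items()}


def mean_based_weights(stats):
    # Hypothetical scheme: phonemes with a longer mean duration weigh more.
    return normalize({p: m for p, (m, _) in stats.items()})


def std_based_weights(stats, eps=1e-6):
    # Hypothetical scheme: phonemes with a more stable duration
    # (smaller standard deviation) weigh more.
    return normalize({p: 1.0 / (s + eps) for p, (_, s) in stats.items()})


def weighted_target_cost(phoneme, target_dur, candidate_dur, weights):
    # Assumed cost: absolute duration mismatch scaled by the phoneme weight.
    return weights[phoneme] * abs(target_dur - candidate_dur)


# Toy durations in seconds, invented for this example.
durations = {"a": [0.10, 0.12, 0.14], "k": [0.04, 0.05, 0.06]}
stats = duration_stats(durations)
w_mean = mean_based_weights(stats)
w_std = std_based_weights(stats)
```

With these toy values, the mean-based scheme weights the vowel "a" above the stop "k", while the standard-deviation scheme does the opposite, which shows how the two statistics can rank the same units differently.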
Pages: 549-552
Page count: 4