Hybrid statistical/unit-selection Turkish speech synthesis using suffix units

Authors
Cenk Demiroğlu
Ekrem Güner
Affiliations
[1] Ozyegin University, Electrical and Computer Engineering Department
Keywords
Statistical speech synthesis; Hybrid speech synthesis; Suffix selection; Turkish
Abstract
Unit-selection-based text-to-speech synthesis (TTS) has been the dominant TTS approach of the last decade. Despite its success, the unit selection approach has disadvantages. One of the most significant is the sudden discontinuities in speech that distract listeners (Speech Commun 51:1039–1064, 2009). The second is that building a high-quality synthesis system requires significant expertise and large amounts of data, which is costly and time-consuming. The statistical speech synthesis (SSS) approach is a promising alternative. Not only are the spurious errors observed in unit selection systems mostly absent in SSS, but building voice models is also far cheaper and faster than building a unit selection system. However, the resulting speech is typically not as natural-sounding as speech synthesized with a high-quality unit selection system. Hybrid methods attempt to take advantage of both SSS and unit selection, but existing hybrid methods still require the development of a high-quality unit selection system. Here, we propose a novel hybrid statistical/unit selection system for Turkish that aims to improve the quality of the baseline SSS system by improving prosodic parameters such as intonation and stress. Commonly occurring suffixes in Turkish are stored in the unit selection database and used in the proposed system. Unlike existing hybrid systems, the proposed system was developed without building a complete unit selection synthesis system. The proposed method can therefore be used without the large amounts of data, substantial expertise, or time-consuming tuning typically required to build unit selection systems. Listeners preferred the hybrid system over the baseline system in AB preference tests.
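The idea of storing common suffixes and using them alongside the statistical system can be illustrated with a minimal sketch. This is not the authors' implementation: the suffix inventory, the function names `split_on_stored_suffix` and `plan_synthesis`, and the purely orthographic longest-match rule are all assumptions made for illustration. The abstract does not say how suffixes are detected or how the stored units are combined with the statistically generated speech.

```python
from typing import List, Set, Tuple

# Hypothetical inventory of stored suffixes (illustrative, not from the paper).
SUFFIX_INVENTORY: Set[str] = {"lar", "ler", "da", "de", "larda", "lerde"}


def split_on_stored_suffix(
    word: str, inventory: Set[str] = SUFFIX_INVENTORY
) -> Tuple[str, str]:
    """Return (stem, suffix) for the longest stored suffix, or (word, "").

    Assumption: matching is done on spelling only. A real system would
    more likely rely on morphological or phonetic analysis.
    """
    # Start at 1 so the stem is never empty; earlier starts mean longer suffixes.
    for start in range(1, len(word)):
        candidate = word[start:]
        if candidate in inventory:
            return word[:start], candidate
    return word, ""


def plan_synthesis(words: List[str]) -> List[Tuple[str, str]]:
    """Label each segment with the (assumed) component that would render it."""
    plan: List[Tuple[str, str]] = []
    for word in words:
        stem, suffix = split_on_stored_suffix(word)
        plan.append((stem, "statistical"))
        if suffix:
            plan.append((suffix, "unit_selection"))
    return plan
```

Under these assumptions, `plan_synthesis(["evlerde", "ev"])` routes the stem `ev` to the statistical component in both words and only the stored suffix `lerde` to the unit selection database, so words without a stored suffix fall back entirely to the baseline system.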
Related papers
  • [1] Hybrid statistical/unit-selection Turkish speech synthesis using suffix units
    Demiroglu, Cenk
    Guner, Ekrem
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2016, : 1 - 16
  • [2] Unit-Selection Speech Synthesis Method Using Words as Search Units
    Segi, Hiroyuki
    [J]. INTERNATIONAL JOURNAL OF MULTIMEDIA DATA ENGINEERING & MANAGEMENT, 2016, 7 (02): : 53 - 67
  • [3] Expressive Prosody for Unit-selection Speech Synthesis
    Strom, Volker
    Clark, Robert
    King, Simon
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1296 - 1299
  • [4] Efficient Unit-Selection in Text-to-Speech Synthesis
    Mihelic, Ales
    Gros, Jerneja Zganec
    [J]. TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2008, 5246 : 411 - 418
  • [5] On the Impact of Annotation Errors on Unit-Selection Speech Synthesis
    Matousek, Jindrich
    Tihelka, Daniel
    Smidl, Lubos
    [J]. TEXT, SPEECH AND DIALOGUE, TSD 2012, 2012, 7499 : 456 - 463
  • [6] A Small Footprint Hybrid Statistical and Unit Selection Text-to-Speech Synthesis System for Turkish
    Guner, Ekrem
    Demiroglu, Cenk
    [J]. COMPUTER AND INFORMATION SCIENCES II, 2012, : 85 - 91
  • [7] PROSODIC CONTROL OF UNIT-SELECTION SPEECH SYNTHESIS: A PROBABILISTIC APPROACH
    Veaux, Christophe
    Rodet, Xavier
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5360 - 5363
  • [8] An efficient unit-selection method for embedded concatenative speech synthesis
    Gros, Jerneja Zganec
    Zganec, Mario
    [J]. INFORMACIJE MIDEM-JOURNAL OF MICROELECTRONICS ELECTRONIC COMPONENTS AND MATERIALS, 2007, 37 (03): : 158 - 164
  • [9] Automatic Duration Weighting in Thai Unit-selection Speech Synthesis
    Saychum, S.
    Rugchatjaroen, A.
    Thatphithakkul, N.
    Wutiwiwatchai, C.
    Thangthai, A.
    [J]. ECTI-CON 2008: PROCEEDINGS OF THE 2008 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING/ELECTRONICS, COMPUTER, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY, VOLS 1 AND 2, 2008, : 549 - 552
  • [10] Slovak speech database for experiments and application building in unit-selection speech synthesis
    Rusko, M
    Trnka, M
    Darzágín, S
    Cernak, M
    [J]. TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2004, 3206 : 457 - 464