The Target Cost Formulation in Unit Selection Speech Synthesis

Authors
Taylor, Paul [1]
Affiliation
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
Keywords
speech synthesis; unit selection; target cost; decision trees; neural networks
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We review the various approaches that have been used to define the target cost in unit selection speech synthesis and show that there are a number of different and sometimes incompatible ways of defining it. We propose that this cost should be thought of as a measure of how similar two units sound to a human listener. We discuss which features should be used in unit selection and the pros and cons of using derived features such as F0. We then explore some algorithms used to calculate target costs and show that none is ideal for the problem. Finally, we propose a new solution that uses a neural network to synthesise points in acoustic space, around which new clusters of units can be built at run time.
Pages: 2038 - 2041
Number of pages: 4
Related papers
50 in total
  • [1] Defining a Global Adaptive Duration Target Cost for Unit Selection Speech Synthesis
    Guennec, David
    Chevelu, Jonathan
    Lolive, Damien
    [J]. TEXT, SPEECH, AND DIALOGUE (TSD 2015), 2015, 9302 : 149 - 157
  • [2] A classifier-based target cost for unit selection speech synthesis trained on perceptual data
    Strom, Volker
    King, Simon
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 150 - 153
  • [3] Speech unit selection based on target values driven by speech data in concatenative speech synthesis
    Hirai, T
    Tenpaku, S
    Shikano, K
    [J]. PROCEEDINGS OF THE 2002 IEEE WORKSHOP ON SPEECH SYNTHESIS, 2002, : 43 - 46
  • [4] Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer
    Fu, Ruibo
    Tao, Jianhua
    Zheng, Yibin
    Wen, Zhengqi
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2514 - 2518
  • [5] OPTIMIZATION OF COST FUNCTION WEIGHTS FOR UNIT SELECTION SPEECH SYNTHESIS USING SPEECH RECOGNITION
    Pobar, Miran
    Martincic-Ipsic, Sanda
    Ipsic, Ivo
    [J]. NEURAL NETWORK WORLD, 2012, 22 (05) : 429 - 441
  • [6] Joint Target and Join Cost Weight Training for Unit Selection Synthesis
    Latacz, Lukas
    Mattheyses, Wesley
    Verhelst, Werner
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 328 - +
  • [7] Subjective evaluation of join cost and smoothing methods for unit selection speech synthesis
    Vepa, Jithendra
    King, Simon
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (05) : 1763 - 1771
  • [8] Syllable specific unit selection cost functions for text-to-speech synthesis
    Narendra, N.P.
    Sreenivasa Rao, K.
    [J]. ACM Transactions on Speech and Language Processing, 2012, 9 (03):
  • [9] A Dynamic Cost Weighting Framework for Unit Selection Text-to-Speech Synthesis
    Bellegarda, Jerome R.
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (06) : 1455 - 1463
  • [10] Unit selection speech synthesis in noise
    Cernak, Milos
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 761 - 764