A short latency unit selection method with redundant search for concatenative speech synthesis

Cited by: 0
Authors: Nishizawa, Nobuyuki; Kawai, Hisashi
Affiliations: (not listed)
Keywords: (not listed)
DOI: not available
Chinese Library Classification: O42 [Acoustics]
Discipline codes: 070206; 082403
Abstract
A new method for short-latency unit selection is proposed. To provide a prompt response in concatenative speech synthesis systems with large unit databases, waveforms should be output before all speech segment units of an utterance are determined. For that purpose, short-latency unit selection algorithms were introduced in our previous study. However, short-latency unit selection may degrade synthesis quality because units belonging to the optimal unit sequence can be pruned by the forced unit determination during the search. In the proposed method, this degradation is suppressed by redundantly expanding hypotheses on the basis of an N-best search. The results of unit selection experiments in a practical configuration indicate that the proposed method is superior to the conventional DP search method when the latency of unit selection is set to be short.
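The abstract contrasts the conventional DP (Viterbi) search with a redundant N-best hypothesis expansion under forced early unit determination. The Python sketch below is an illustration only, not the authors' implementation: it assumes a simple additive target-plus-concatenation cost model, and the function name `select_units`, the `latency` parameter, and the toy costs are hypothetical.

```python
"""Illustrative sketch of short-latency unit selection with redundant search.

Not the authors' algorithm: a generic lattice search over candidate units
where units are committed (output) with a fixed lookahead, and where keeping
the N best partial paths per candidate reduces the risk of pruning the
globally optimal sequence before the forced decision.
"""

from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Hypothesis:
    cost: float             # accumulated cost of this partial path
    units: Tuple[int, ...]  # chosen candidate index for each target so far


def select_units(
    candidates: Sequence[Sequence[object]],          # candidates[t]: units for target t
    target_cost: Callable[[int, object], float],     # cost of a unit for target t
    concat_cost: Callable[[object, object], float],  # cost of joining two units
    latency: int,                                    # lookahead (in targets) before committing
    n_best: int = 1,                                 # 1 = plain DP; >1 = redundant search
) -> List[int]:
    committed: List[int] = []                 # indices already fixed ("output")
    hyps: List[Hypothesis] = [Hypothesis(0.0, ())]

    for t, cands in enumerate(candidates):
        # Expand every surviving hypothesis with every candidate for target t.
        expanded: List[Hypothesis] = []
        for h in hyps:
            for j, unit in enumerate(cands):
                c = h.cost + target_cost(t, unit)
                if h.units:
                    prev_unit = candidates[t - 1][h.units[-1]]
                    c += concat_cost(prev_unit, unit)
                expanded.append(Hypothesis(c, h.units + (j,)))

        # Per last unit, keep only the n_best cheapest partial paths
        # (n_best = 1 reduces to ordinary Viterbi recombination).
        by_last: dict = {}
        for h in expanded:
            by_last.setdefault(h.units[-1], []).append(h)
        hyps = []
        for group in by_last.values():
            group.sort(key=lambda h: h.cost)
            hyps.extend(group[:n_best])

        # Forced determination for short latency: once the search is `latency`
        # targets ahead of the oldest undecided position, commit that position
        # from the currently best hypothesis and prune paths that disagree.
        while t - len(committed) >= latency:
            pos = len(committed)
            best = min(hyps, key=lambda h: h.cost)
            committed.append(best.units[pos])
            hyps = [h for h in hyps if h.units[pos] == committed[pos]]

    # Flush the remaining positions from the best surviving hypothesis.
    best = min(hyps, key=lambda h: h.cost)
    committed.extend(best.units[len(committed):])
    return committed


if __name__ == "__main__":
    # Toy usage: units are floats; target cost = distance to the target value,
    # concatenation cost = jump between consecutive units.
    targets = [1.0, 2.0, 3.0, 4.0]
    cands = [[v - 0.2, v, v + 0.3] for v in targets]
    print(select_units(cands,
                       target_cost=lambda t, u: abs(u - targets[t]),
                       concat_cost=lambda a, b: abs(b - a),
                       latency=1, n_best=3))
```

With `n_best = 1` this reduces to a conventional DP search with forced determination; a larger `n_best` keeps redundant competing partial paths alive through the forced decision, which is the mechanism the abstract describes for limiting quality degradation at short latency.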
Pages: 757-760
Page count: 4
Related papers (50 items in total)
[1] Sakai, Shinsuke; Kawahara, Tatsuya; Nakamura, Satoshi. Admissible stopping in Viterbi beam search for unit selection in concatenative speech synthesis. 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008: 4613-4616.
[2] Gros, Jerneja Zganec; Zganec, Mario. An efficient unit-selection method for embedded concatenative speech synthesis. Informacije MIDEM - Journal of Microelectronics, Electronic Components and Materials, 2007, 37(3): 158-164.
[3] Mizutani, T.; Kagoshima, T. Concatenative speech synthesis based on the plural unit selection and fusion method. IEICE Transactions on Information and Systems, 2005, E88-D(11): 2565-2572.
[4] Tamura, M.; Mizutani, T.; Kagoshima, T. Scalable concatenative speech synthesis based on the plural unit selection and fusion method. 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005: 361-364.
[5] Huang, F. J.; Cosatto, E.; Graf, H. P. Triphone based unit selection for concatenative visual speech synthesis. 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2002: 2037-2040.
[6] Bulyko, I.; Ostendorf, M. Joint prosody prediction and unit selection for concatenative speech synthesis. 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2001: 781-784.
[7] Gros, Jerneja Zganec; Zganec, Mario. An efficient unit-selection method for concatenative text-to-speech synthesis systems. Journal of Computing and Information Technology, 2008, 16(1): 69-78.
[8] Hirai, T.; Tenpaku, S.; Shikano, K. Speech unit selection based on target values driven by speech data in concatenative speech synthesis. Proceedings of the 2002 IEEE Workshop on Speech Synthesis, 2002: 43-46.
[9] Tamura, Masatsune; Mizutani, Tatsuya; Kagoshima, Takehiko. Fast concatenative speech synthesis using pre-fused speech units based on the plural unit selection and fusion method. IEICE Transactions on Information and Systems, 2007, E90-D(2): 544-553.
[10] Torres, H. M.; Gurlekian, J. A. Acoustic speech unit segmentation for concatenative synthesis. Computer Speech and Language, 2008, 22(2): 196-206.