On the Role of Spectral Dynamics in Unit Selection Speech Synthesis

Cited by: 0
Authors
Kirkpatrick, Barry [1 ]
O'Brien, Darragh [1 ]
Scaife, Ronan [1 ]
Errity, Andrew [1 ]
Affiliations
[1] Dublin City Univ, Fac Engn & Comp, Res Inst Networks & Commun Engn, Dublin 9, Ireland
Source
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 | 2007
Keywords
speech synthesis; join costs; auditory perception; spectral dynamics; feature extraction;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cost functions employed in unit selection significantly influence the quality of speech output. Although unit selection can produce very natural-sounding speech, the quality can be inconsistent and is difficult to guarantee because of discontinuities between incompatible units. The join cost, which measures the suitability of concatenating speech units, typically consists of sub-costs representing the fundamental frequency and spectrum at the boundaries of each unit. In this study, the role of spectral dynamics as a join cost in unit selection synthesis is explored. A number of spectral dynamic measures are tested for the task of detecting discontinuities. Results indicate that spectral dynamic measures correlate with human perception of discontinuity if the features are extracted appropriately. Spectral dynamic mismatch is found to be a source of discontinuity, although results suggest this is likely to occur simultaneously with static spectral mismatch.
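As a rough illustration of the idea summarised in the abstract, the sketch below combines a fundamental frequency sub-cost, a static spectral sub-cost, and a spectral dynamic sub-cost into one join cost. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (`euclidean`, `join_cost`), the weights, the use of Euclidean distance, and the first-order frame difference as the dynamic feature are all hypothetical choices.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def join_cost(left, right, f0_left, f0_right,
              w_f0=1.0, w_static=1.0, w_dynamic=1.0):
    """Hypothetical join cost between two candidate units.

    left  : spectral frames of the left unit (list of lists), ending at the join
    right : spectral frames of the right unit, starting at the join
    Each unit must supply at least two frames so a slope can be estimated.
    The weights are illustrative, not tuned values.
    """
    # Static sub-cost: distance between the two frames that meet at the join.
    static = euclidean(left[-1], right[0])

    # Dynamic sub-cost: mismatch between the frame-to-frame spectral change
    # just before the join and just after it (first-order differences).
    d_left = [b - a for a, b in zip(left[-2], left[-1])]
    d_right = [b - a for a, b in zip(right[0], right[1])]
    dynamic = euclidean(d_left, d_right)

    # Fundamental frequency sub-cost: absolute F0 difference at the boundary.
    f0 = abs(f0_left - f0_right)

    return w_f0 * f0 + w_static * static + w_dynamic * dynamic


# Two right-hand candidates share the same boundary frame, so their static
# sub-cost is identical; only the direction of spectral movement differs.
left_unit = [[0.0, 0.0], [1.0, 1.0]]
continuing = [[1.0, 1.0], [2.0, 2.0]]   # trajectory keeps rising
reversing = [[1.0, 1.0], [0.0, 0.0]]    # trajectory turns back

cost_continuing = join_cost(left_unit, continuing, 100.0, 100.0)
cost_reversing = join_cost(left_unit, reversing, 100.0, 100.0)
```

In this toy case the static term cannot distinguish the two candidates, while the dynamic term penalises the one whose spectral trajectory reverses at the join. The abstract notes that in practice dynamic mismatch is likely to occur together with static mismatch, so a case this clean is contrived.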
Pages: 2029-2032
Page count: 4
Related papers
50 in total
  • [21] Spectral dynamics as a source of discontinuity in concatenative speech synthesis
    Kirkpatrick, Barry
    O'Brien, Darragh
    Scaife, Ronan
    Errity, Andrew
    PROCEEDINGS OF THE 2007 15TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING, 2007, : 615 - +
  • [22] Speech unit selection based on target values driven by speech data in concatenative speech synthesis
    Hirai, T
    Tenpaku, S
    Shikano, K
    PROCEEDINGS OF THE 2002 IEEE WORKSHOP ON SPEECH SYNTHESIS, 2002, : 43 - 46
  • [23] Minimum unit selection error training for HMM-based unit selection speech synthesis system
    Ling, Zhen-Hua
    Wang, Ren-Hua
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 3949 - 3952
  • [24] Building of a Speech Corpus Optimised for Unit Selection TTS Synthesis
    Matousek, Jindrich
    Tihelka, Daniel
    Romportl, Jan
    SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, 2008, : 1296 - 1299
  • [25] A comparison of unit selection techniques in limited domain speech synthesis
    Batusek, R
    Gaura, P
    TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2003, 2807 : 251 - 256
  • [26] Optimizing Phonetic Encoding for Viennese Unit Selection Speech Synthesis
    Pucher, Michael
    Neubarth, Friedrich
    Strom, Volker
    DEVELOPMENT OF MULTIMODAL INTERFACES: ACTIVE LISTENING AND SYNCHRONY, 2010, 5967 : 207 - +
  • [27] Unit Selection based Speech Synthesis for Poor Channel Condition
    Cen, Ling
    Dong, Minghui
    Chan, Paul
    Li, Haizhou
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2035 - 2038
  • [28] Trainable unit selection speech synthesis under statistical framework
    WANG RenHua
    Science Bulletin, 2009, (11) : 1963 - 1969
  • [29] Triphone based unit selection for concatenative visual speech synthesis
    Huang, FJ
    Cosatto, E
    Graf, HP
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 2037 - 2040
  • [30] A RESEARCH BED FOR UNIT SELECTION BASED TEXT TO SPEECH SYNTHESIS
    Sarathy, K. Partha
    Ramakrishnan, A. G.
    2008 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY: SLT 2008, PROCEEDINGS, 2008, : 229 - +