On the Role of Spectral Dynamics in Unit Selection Speech Synthesis

Citations: 0
Authors
Kirkpatrick, Barry [1]
O'Brien, Darragh [1]
Scaife, Ronan [1]
Errity, Andrew [1]
Affiliations
[1] Dublin City Univ, Fac Engn & Comp, Res Inst Networks & Commun Engn, Dublin 9, Ireland
Source
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 | 2007
Keywords
speech synthesis; join costs; auditory perception; spectral dynamics; feature extraction;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cost functions employed in unit selection significantly influence the quality of speech output. Although unit selection can produce very natural-sounding speech, the quality can be inconsistent and is difficult to guarantee because of discontinuities between incompatible units. The join cost, which measures the suitability of concatenating speech units, typically consists of sub-costs representing the fundamental frequency and spectrum at the boundaries of each unit. In this study, the role of spectral dynamics as a join cost in unit selection synthesis is explored. A number of spectral dynamic measures are tested for the task of detecting discontinuities. Results indicate that spectral dynamic measures correlate with human perception of discontinuity if the features are extracted appropriately. Spectral dynamic mismatch is found to be a source of discontinuity, although results suggest that it is likely to occur simultaneously with static spectral mismatch.
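As an illustration of the join cost structure described in the abstract, the following minimal sketch combines a static spectral sub-cost, a spectral dynamic sub-cost, and a fundamental frequency sub-cost at a unit boundary. The abstract does not specify which features or distance measures the authors used, so everything here is an assumption made for illustration: the function names, the use of first-order frame differences as the "dynamic" feature, the Euclidean distance, and the equal default weights.

```python
import math
from typing import List, Sequence

Frame = Sequence[float]  # one spectral feature vector (hypothetical, e.g. cepstral coefficients)


def euclidean(a: Frame, b: Frame) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def delta(prev: Frame, nxt: Frame) -> List[float]:
    """First-order difference between consecutive frames (assumed dynamic feature)."""
    return [n - p for p, n in zip(prev, nxt)]


def join_cost(
    left: Sequence[Frame],
    right: Sequence[Frame],
    f0_left: float,
    f0_right: float,
    w_static: float = 1.0,
    w_dynamic: float = 1.0,
    w_f0: float = 1.0,
) -> float:
    """Weighted sum of sub-costs at the boundary between two candidate units.

    left  : frames of the left unit; its last two frames are used
    right : frames of the right unit; its first two frames are used
    """
    if len(left) < 2 or len(right) < 2:
        raise ValueError("each unit needs at least two frames")
    # Static mismatch: spectra on either side of the join.
    static = euclidean(left[-1], right[0])
    # Dynamic mismatch: spectral slope approaching the join vs. leaving it.
    dynamic = euclidean(delta(left[-2], left[-1]), delta(right[0], right[1]))
    # Fundamental frequency mismatch at the boundary.
    f0 = abs(f0_left - f0_right)
    return w_static * static + w_dynamic * dynamic + w_f0 * f0


if __name__ == "__main__":
    left_unit = [[0.0, 0.0], [1.0, 1.0]]
    smooth_right = [[1.0, 1.0], [2.0, 2.0]]    # continues the same trajectory
    reversed_right = [[1.0, 1.0], [0.0, 0.0]]  # same boundary spectrum, opposite slope
    print(join_cost(left_unit, smooth_right, 100.0, 100.0))
    print(join_cost(left_unit, reversed_right, 100.0, 100.0))
```

In the second call the boundary spectra match exactly, so the static sub-cost is zero and only the dynamic term contributes. Real unit pairs may behave differently: the abstract reports that dynamic mismatch tends to occur together with static mismatch.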
Pages: 2029-2032
Page count: 4
Related papers
50 in total
  • [41] Analysis of statistical parametric and unit selection speech synthesis systems applied to emotional speech
    Barra-Chicote, Roberto
    Yamagishi, Junichi
    King, Simon
    Manuel Montero, Juan
    Macias-Guarasa, Javier
    SPEECH COMMUNICATION, 2010, 52 (05) : 394 - 404
  • [42] Globally optimal training of unit boundaries in unit selection text-to-speech synthesis
    Bellegarda, Jerome R.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (03): : 957 - 965
  • [43] Automatic Segmentation Quality Improvement for Realization of Unit Selection Speech Synthesis
    Szklanny, Krzysztof
    Wojtowski, Michal
    2008 CONFERENCE ON HUMAN SYSTEM INTERACTIONS, VOLS 1 AND 2, 2008, : 245 - 250
  • [44] Evaluation of Finnish Unit Selection and HMM-based Speech Synthesis
    Silen, Hanna
    Helander, Elina
    Nurminen, Jani
    Gabbouj, Moncef
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1853 - +
  • [45] Learned dictionaries for sparse representation based unit selection speech synthesis
    Sharma, Pulkit
    Abrol, Vinayak
    Sao, Anil Kumar
    2016 TWENTY SECOND NATIONAL CONFERENCE ON COMMUNICATION (NCC), 2016,
  • [46] A statistical method for database reduction for embedded unit selection speech synthesis
    Tsiakoulis, Pirros
    Chalamandaris, Aimilios
    Karabetsos, Sotiris
    Raptis, Spyros
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4601 - 4604
  • [47] Maximum Likelihood Unit Selection for Corpus-based Speech Synthesis
    Gamboa Rosales, Abubeker
    Rosales, Hamurabi Gamboa
    Hoffmann, Ruediger
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 748 - +
  • [48] Concatenative speech synthesis based on the plural unit selection and fusion method
    Mizutani, T
    Kagoshima, T
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2005, E88D (11): : 2565 - 2572
  • [49] An Overview of the ILSP Unit Selection Text-to-Speech Synthesis System
    Tsiakoulis, Pirros
    Karabetsos, Sotiris
    Chalamandaris, Aimilios
    Raptis, Spyros
    ARTIFICIAL INTELLIGENCE: METHODS AND APPLICATIONS, 2014, 8445 : 370 - 383
  • [50] Continuity Metric for Unit Selection based Text-to-Speech Synthesis
    Lakkavalli, Vikram Ramesh
    Arulmozhi, P.
    Ramakrishnan, A. G.
    2010 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM), 2010,