Assessment and correction of voice quality variabilities in large speech databases for concatenative speech synthesis

Cited by: 0
Author
Stylianou, Yannis [1]
Affiliation
[1] AT&T Lab-Research, Florham Park, NJ, United States
Keywords
Database systems - Error compensation - Error correction - Functions - Mathematical models - Regression analysis - Signal filtering and prediction - Speech intelligibility
DOI
Not available
Abstract
In an effort to increase the naturalness of concatenative speech synthesis, large speech databases may be recorded. While varied prosodic and spectral characteristics are desirable in such a database, variable voice quality is not. In this paper we present an automatic method for assessing voice quality in large speech databases for concatenative speech synthesis and correcting it whenever necessary. The proposed method uses a Gaussian Mixture Model (GMM) to model the acoustic space of the database speaker and autoregressive filters for compensation. An objective method for measuring the effectiveness of the database correction, based on a likelihood function for the speaker's GMM, is also presented. Both objective and subjective results show that the proposed method detects voice quality problems and corrects them successfully. Results show a 14.2% improvement of the log-likelihood function after compensation.
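The objective measure mentioned in the abstract, a likelihood function for the speaker's GMM, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the diagonal covariances, the toy model parameters, the frame values, and the helper names (`log_gaussian_diag`, `gmm_log_likelihood`, `average_log_likelihood`, `relative_improvement`). The abstract does not state how the 14.2% figure is normalized, so the relative-change formula below is only one plausible reading. The autoregressive compensation filters are not covered.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, weights, means, variances):
    """log p(x) under a GMM, using log-sum-exp for numerical stability."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def average_log_likelihood(frames, weights, means, variances):
    """Mean per-frame log-likelihood of a set of feature vectors."""
    total = sum(gmm_log_likelihood(f, weights, means, variances) for f in frames)
    return total / len(frames)


def relative_improvement(before, after):
    """Change in average log-likelihood as a fraction of |before| (assumed definition)."""
    return (after - before) / abs(before)


# Hypothetical toy speaker model: two components in a 2-D feature space.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

# Hypothetical feature frames: one set far from the model, one set close to it.
deviating = [[1.5, 1.5], [5.0, 5.0]]
corrected = [[0.1, -0.2], [2.9, 3.1]]

ll_before = average_log_likelihood(deviating, weights, means, variances)
ll_after = average_log_likelihood(corrected, weights, means, variances)
print(f"before: {ll_before:.3f}  after: {ll_after:.3f}")
print(f"relative improvement: {relative_improvement(ll_before, ll_after):.1%}")
```

In this reading, frames that match the speaker model better score a higher average log-likelihood, so a successful correction should raise the score of the affected recordings.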
Pages: 377-380
Related papers
50 in total
  • [1] Assessment and correction of voice quality variabilities in large speech databases for concatenative speech synthesis
    Stylianou, Y
    [J]. ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 377 - 380
  • [2] Selection in a concatenative speech synthesis system using a large speech database
    Hunt, AJ
    Black, AW
    [J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 373 - 376
  • [3] High-Individuality Voice Conversion Based on Concatenative Speech Synthesis
    Fujii, Kei
    Okawa, Jun
    Suigetsu, Kaori
    [J]. PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 26, PARTS 1 AND 2, DECEMBER 2007, 2007, 26 : 483 - 488
  • [4] SET OF CONCATENATIVE UNITS FOR SPEECH SYNTHESIS
    OLIVE, J
    LIBERMAN, M
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1979, 65 : S130 - S130
  • [5] On the detection of discontinuities in concatenative speech synthesis
    Pantazis, Yannis
    Stylianou, Yannis
    [J]. PROGRESS IN NONLINEAR SPEECH PROCESSING, 2007, 4391 : 89 - +
  • [6] Spectral modification for concatenative speech synthesis
    Wouters, J
    Macon, MW
    [J]. 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 941 - 944
  • [7] Discriminative training for concatenative speech synthesis
    Kim, NS
    Park, SS
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2004, 11 (01) : 40 - 43
  • [8] Application of large speech databases for speech synthesis in artificial intelligence systems
    Lyudovik, T.V.
    Sazhok, N.N.
    [J]. Journal of Automation and Information Sciences, 2003, 35 (12) : 9 - 13
  • [9] Nonlinear speech features for the objective detection of discontinuities in concatenative speech synthesis
    Pantazis, Y
    Stylianou, Y
    [J]. NONLINEAR SPEECH MODELING AND APPLICATIONS, 2005, 3445 : 375 - 383
  • [10] Forward masking phenomenon in concatenative speech synthesis
    Cernak, M
    Rozinaj, G
    [J]. PROCEEDINGS EC-VIP-MC 2003, VOLS 1 AND 2, 2003, : 691 - 694