Speech Synthesis Based on Gaussian Conditional Random Fields

Cited by: 2
Authors
Khorram, Soheil [1]
Bahmaninezhad, Fahimeh [1]
Sameti, Hossein [1]
Affiliations
[1] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
Keywords
Gaussian conditional random field; Statistical parametric speech synthesis; HSMM extension; ALGORITHMS; HMM;
DOI
10.1007/978-3-319-10849-0_19
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Hidden Markov Model (HMM)-based speech synthesis (HTS) has recently been confirmed to be the most effective method for generating natural speech. However, it lacks adequate context generalization when the training data is limited. As a solution, the current study provides a new context-dependent speech modeling framework based on Gaussian Conditional Random Field (GCRF) theory. Using this model, an innovative speech synthesis system has been developed that can be viewed as an extension of the Context-Dependent Hidden Semi-Markov Model (CD-HSMM). A novel Viterbi decoder, along with a stochastic gradient ascent algorithm, was applied to train the model parameters. In addition, a fast and efficient parameter generation algorithm was derived for the synthesis part. Experimental results using objective and subjective criteria show that the proposed system outperforms HSMM substantially on limited speech databases. Moreover, the Mel-cepstral distance of the spectral parameters is reduced considerably for any size of training database.
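The abstract names Mel-cepstral distance as the objective measure but does not give its formula. As a rough illustration only, the sketch below uses the commonly cited Mel-cepstral distortion in dB, averaged over time-aligned frames and excluding the 0th (energy) coefficient. The function name, the frame layout, and the exclusion of c0 are assumptions for this example, not details taken from the paper.

```python
import math


def mel_cepstral_distortion(ref_frames, syn_frames):
    """Mean Mel-cepstral distortion (dB) over time-aligned frames.

    Assumptions (not from the paper): each frame is a sequence of
    mel-cepstral coefficients [c0, c1, ..., cD], the two inputs are
    already time-aligned, and c0 (energy) is excluded from the distance.
    """
    if not ref_frames or len(ref_frames) != len(syn_frames):
        raise ValueError("inputs must be non-empty and have equal length")

    # Constant factor of the usual definition: (10 / ln 10) * sqrt(2)
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)

    total = 0.0
    for ref, syn in zip(ref_frames, syn_frames):
        if len(ref) != len(syn):
            raise ValueError("frames must have the same number of coefficients")
        # Squared Euclidean distance over c1..cD (c0 skipped)
        squared = sum((r - s) ** 2 for r, s in zip(ref[1:], syn[1:]))
        total += scale * math.sqrt(squared)

    return total / len(ref_frames)


if __name__ == "__main__":
    natural = [[1.0, 0.5, -0.2], [0.9, 0.4, -0.1]]
    synthetic = [[1.1, 0.4, -0.2], [0.8, 0.4, 0.0]]
    print(f"MCD: {mel_cepstral_distortion(natural, synthetic):.3f} dB")
```

A lower value means the synthesized spectral parameters are closer to the natural ones, which is the sense in which the abstract reports a reduction.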
Pages: 183-193
Number of pages: 11