Rich Context Modeling for High Quality HMM-Based TTS

Cited by: 0
Authors
Yan, Zhi-Jie [1]
Qian, Yao [1]
Soong, Frank K. [1]
Affiliations
[1] Microsoft Research Asia, Beijing, People's Republic of China
Keywords
HMM-based TTS; rich context modeling
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a rich context modeling approach to high-quality HMM-based speech synthesis. We first analyze the over-smoothing problem in conventional decision-tree-tying-based HMMs, and then propose modeling the training speech tokens with rich context models. A special training procedure is adopted for reliable estimation of the rich context model parameters. In synthesis, a search algorithm following a context-based pre-selection is performed to determine the optimal rich context model sequence, which generates natural and crisp output speech. Experimental results show that spectral envelopes synthesized by the rich context models have crisper formant structures and evolve with richer details than those obtained by the conventional models. The speech quality improvement is also perceived by listeners in a subjective preference test, in which 76% of the sentences synthesized using rich context modeling are preferred.
Pages: 1767-1770
Page count: 4