Excitation Modeling for HMM-based Speech Synthesis Based on Principal Component Analysis

Cited by: 0
Authors
Narendra, N. P. [1 ]
Reddy, M. Kiran [1 ]
Rao, K. Sreenivasa [1 ]
Affiliations
[1] Indian Inst Technol, Sch Informat Technol, Kharagpur 721302, W Bengal, India
Keywords
SYNTHESIS SYSTEM; EXTRACTION;
DOI
Not available
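Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes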
0808 ; 0809 ;
Abstract
This paper proposes a new excitation modeling method for improving the quality of HMM-based speech synthesis. The proposed excitation (source) modeling method models the pitch-synchronous residual frames extracted from the excitation signal. Initially, principal component analysis (PCA) is performed on the pitch-synchronous residual frames. Based on this analysis, the pitch-synchronous residual frames are parameterized in two stages. In the first stage, the dominant component of the residual frame is represented using PCA coefficients; in the second stage, the noise component of the residual frame is parameterized in terms of spectral and amplitude envelopes. The proposed excitation model is integrated into an HMM-based speech synthesis system. Subjective evaluation results indicate that the speech synthesized with the proposed excitation model is significantly better than that synthesized with two existing excitation modeling methods.
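The two-stage parameterization described in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything below is an assumption made for illustration only: the frames are taken to be already extracted and length-normalized, the function name `pca_two_stage`, the number of components, the moving-average amplitude envelope, and the raw FFT magnitude used as a spectral envelope are all hypothetical choices, not the authors' method.

```python
import numpy as np

def pca_two_stage(frames, n_components=5, smooth=5):
    """Hypothetical two-stage parameterization of residual frames.

    frames: (n_frames, frame_len) array of pitch-synchronous residual
    frames, assumed to be already resampled to a common length.
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]

    # Stage 1: the dominant component is represented by PCA coefficients.
    coeffs = centered @ basis.T
    dominant = mean + coeffs @ basis

    # Stage 2: what PCA does not capture is treated as the noise component
    # and summarized by envelopes (assumed definitions, see lead-in).
    noise = frames - dominant
    kernel = np.ones(smooth) / smooth
    amp_env = np.array(
        [np.convolve(np.abs(row), kernel, mode="same") for row in noise]
    )
    spec_env = np.abs(np.fft.rfft(noise, axis=1))
    return coeffs, basis, mean, amp_env, spec_env
```

With, say, 40 frames of 64 samples, `coeffs` has shape `(40, 5)` and each envelope holds one row per frame. Keeping more components moves energy from the noise component into the PCA coefficients, which is the trade-off a real system would have to tune.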
Pages: 6
Related papers
50 in total
  • [1] Excitation Modeling Based on Waveform Interpolation for HMM-based Speech Synthesis
    Sung, June Sig
    Hong, Doo Hwa
    Oh, Kyung Hwan
    Kim, Nam Soo
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 813 - 816
  • [2] Statistical Approaches to Excitation Modeling in HMM-Based Speech Synthesis
    Sung, June Sig
    Hong, Doo Hwa
    Koo, Hyun Woo
    Kim, Nam Soo
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2013, E96D (02): : 379 - 382
  • [3] Excitation Modeling Method Based on Inverse Filtering for HMM-Based Speech Synthesis
    Reddy, M. Kiran
    Rao, K. Sreenivasa
    [J]. MACHINE INTELLIGENCE AND SIGNAL ANALYSIS, 2019, 748 : 85 - 91
  • [4] A trainable excitation model for HMM-based speech synthesis
    Maia, R.
    Toda, T.
    Zen, H.
    Nankaku, Y.
    Tokuda, K.
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1125 - +
  • [5] Amplitude Spectrum based Excitation Model for HMM-based Speech Synthesis
    Wen, Zhengqi
    Tao, Jianhua
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1426 - 1429
  • [6] State duration modeling for HMM-based speech synthesis
    Zen, Heiga
    Masuko, Takashi
    Tokuda, Keiichi
    Yoshimura, Takayoshi
    Kobayashi, Takao
    Kitamura, Tadashi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (03): : 692 - 693
  • [7] Analysis of HMM-Based Lombard Speech Synthesis
    Raitio, Tuomo
    Suni, Antti
    Vainio, Martti
    Alku, Paavo
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2792 - +
  • [8] Two-band excitation for HMM-based speech synthesis
    Kim, Sang-Jin
    Hahn, Minsoo
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (01): : 378 - 381
  • [9] Improved Training of Excitation for HMM-based Parametric Speech Synthesis
    Shiga, Yoshinori
    Toda, Tomoki
    Sakai, Shinsuke
    Kawai, Hisashi
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 809 - 812
  • [10] Inverse filter based excitation model for HMM-based speech synthesis system
    Reddy, Mittapalle Kiran
    Rao, Krothapalli Sreenivasa
    [J]. IET SIGNAL PROCESSING, 2018, 12 (04) : 544 - 548