Statistical parametric speech synthesis with a novel codebook-based excitation model

Cited by: 2
Authors
Csapo, Tamas Gabor [1]
Nemeth, Geza [1]
Affiliations
[1] Budapest Univ Technol & Econ, Dept Telecommun & Media Informat, Budapest, Hungary
Source
Keywords
Text-to-speech synthesis; speech processing; excitation model; vocoding; parametric;
DOI
10.3233/IDT-140197
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Speech synthesis is an important modality in Cognitive Infocommunications, which is the intersection of informatics and cognitive sciences. Statistical parametric methods have recently gained importance in speech synthesis. In these methods the speech signal is decomposed into parameters and later restored from them; the decomposition is implemented by speech coders. We apply a novel codebook-based speech coding method to model the excitation of speech. In the analysis stage the speech signal is analyzed frame by frame and a codebook of pitch-synchronous excitations is built from the voiced parts. Timing, gain and harmonic-to-noise ratio parameters are extracted and fed into the machine learning stage of hidden Markov model based speech synthesis. During the synthesis stage the codebook is searched for a suitable element in each voiced frame, and these elements are concatenated to create the excitation signal, from which the final synthesized speech is created. Our initial experiments show that the model fits well into the statistical parametric speech synthesis framework and that in most cases it synthesizes speech of better quality than the traditional pulse-noise excitation. (This paper is an extended version of [10].)
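The analysis and synthesis stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary layout, and the selection criterion (closest pitch period) are assumptions made for illustration, the harmonic-to-noise ratio parameter is omitted, and unvoiced frames are simply filled with Gaussian noise.

```python
import math
import random


def build_codebook(excitation, pitch_marks):
    """Analysis: cut one pitch-synchronous segment per pair of pitch marks.

    Assumes pitch_marks is strictly increasing (hypothetical input format).
    """
    codebook = []
    for start, end in zip(pitch_marks, pitch_marks[1:]):
        segment = excitation[start:end]
        rms = math.sqrt(sum(x * x for x in segment) / len(segment))
        # Store a unit-RMS shape so the gain can be applied at synthesis time.
        shape = [x / rms for x in segment] if rms > 0 else list(segment)
        codebook.append({"period": end - start, "gain": rms, "shape": shape})
    return codebook


def select_entry(codebook, target_period):
    """Pick the entry with the closest pitch period (assumed search criterion)."""
    return min(codebook, key=lambda e: abs(e["period"] - target_period))


def synthesize_excitation(codebook, frames, frame_len, seed=0):
    """Synthesis: concatenate codebook elements for voiced frames, noise otherwise."""
    rng = random.Random(seed)
    out = []
    for frame in frames:
        if frame["voiced"]:
            entry = select_entry(codebook, frame["period"])
            produced = 0
            # Repeat whole pitch cycles until the frame is covered.
            while produced < frame_len:
                out.extend(frame["gain"] * x for x in entry["shape"])
                produced += entry["period"]
        else:
            out.extend(frame["gain"] * rng.gauss(0.0, 1.0) for _ in range(frame_len))
    return out


# Toy residual with three pitch cycles of 80, 100 and 120 samples.
excitation = [0.0] * 300
excitation[0], excitation[80], excitation[180] = 1.0, 2.0, 3.0
codebook = build_codebook(excitation, [0, 80, 180, 300])
frames = [
    {"voiced": True, "period": 95, "gain": 0.5},
    {"voiced": False, "gain": 0.1},
]
signal = synthesize_excitation(codebook, frames, frame_len=160)
```

Because whole pitch cycles are concatenated, a voiced frame may produce slightly more samples than `frame_len`; a fuller system would need to handle that timing and would pass the resulting excitation through a spectral filter to obtain speech.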
Pages: 289-299
Number of pages: 11
Related papers
50 in total
  • [1] A novel codebook-based excitation model for use in speech synthesis
    Csapo, Tamas Gabor
    Nemeth, Geza
    [J]. 3RD IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFOCOMMUNICATIONS (COGINFOCOM 2012), 2012, : 661 - 665
  • [2] Modeling Irregular Voice in Statistical Parametric Speech Synthesis With Residual Codebook Based Excitation
    Csapo, Tamas Gabor
    Nemeth, Geza
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2014, 8 (02) : 209 - 220
  • [3] Improved Codebook-based Speech Enhancement based on MBE Model
    Huang, Qizheng
    Bao, Changchun
    Wang, Xianyun
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3627 - 3631
  • [4] Codebook-based Bayesian speech enhancement
    Srinivasan, S
    Samuelsson, J
    Kleijn, WB
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 1077 - 1080
  • [5] Codebook-based Bayesian speech enhancement for nonstationary environments
    Srinivasan, Sriram
    Samuelsson, Jonas
    Kleijn, W. Bastiaan
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (02) : 441 - 452
  • [6] Binaural Codebook-Based Speech Enhancement With Atomic Speech Presence Probability
    Wood, Sean U. N.
    Stahl, Johannes K. W.
    Mowlaee, Pejman
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (12) : 2150 - 2161
  • [7] Codebook-based Speech Enhancement with Bayesian LP Parameters Estimation
    Wang, Qing
    Bao, Chang-chun
    [J]. 2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2015, : 1245 - 1248
  • [8] Blind Bandwidth Extension for Codebook-based Bayesian Speech Enhancement
    Li, Yaxing
    Kim, Jonghyeon
    Kang, Sangwon
    [J]. 18TH IEEE INTERNATIONAL SYMPOSIUM ON CONSUMER ELECTRONICS (ISCE 2014), 2014,
  • [9] Speech Enhancement in Modulation Domain Using Codebook-based Speech and Noise Estimation
    Mani, Vidhyasagar
    Champagne, Benoit
    Zhu, Wei-Ping
    [J]. 2015 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2015, : 707 - 711
  • [10] Real-Time Codebook-based Speech Enhancement with GPUs
    Prasanna, A. N. Sai
    Gurumurthy, Iyer Chandrashekaran
    Naidu, D. H. R.
    Baruah, Pallav Kumar
    [J]. 2014 INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (PDGC), 2014, : 306 - 311