A Pitch-Synchronous Speech Analysis and Synthesis Method for DNN-SPSS System

Authors
Kim, Jin-Seob [1]
Joo, Young-Sun [1]
Kang, Hong-Goo [1]
Jang, Inseon [2]
Ahn, ChungHyun [2]
Seo, Jeongil [2]
Affiliations
[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul, South Korea
[2] Elect & Telecommun Res Inst, Realist Broadcasting Media Res Dept, Daejeon, South Korea
Keywords
Deep neural network (DNN); statistical parametric speech synthesis (SPSS); pitch-synchronous; glottal closure instants (GCIs)
DOI
Not available
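Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809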
Abstract
This paper proposes a pitch-synchronous deep neural network (DNN)-based statistical parametric speech synthesis (SPSS) system. Speech parameters are extracted from pitch-synchronous frames defined by the locations of glottal closure instants (GCIs), which significantly reduces coupling effects between the vocal tract and excitation signals. As a result, the distribution of spectral parameters within the same context of phonetic classes becomes more uniform, which improves model trainability, especially for a small-scale DNN framework. Although the effectiveness of the pitch-synchronous approach has been proven in other applications, it is not trivial to integrate the method into typical DNN-based SPSS systems, which have regularized structures, i.e., a fixed frame rate and a fixed feature dimension. In this paper, we design a new DNN-based SPSS system that trains and generates speech parameters pitch-synchronously. Objective and subjective test results verify the superiority of the proposed system over the conventional approach.
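As a rough illustration of what GCI-defined frames look like, the following minimal sketch cuts one frame per interior GCI, spanning from the previous GCI to the next one. The abstract does not state the frame span or the window used, so the two-period span, the Hann window, the function name `pitch_synchronous_frames`, and the toy GCI positions are all assumptions made for this example, not the authors' method.

```python
import numpy as np

def pitch_synchronous_frames(signal, gci):
    """Cut one windowed frame per interior GCI (assumed two-period span)."""
    frames = []
    # Each frame runs from the previous GCI to the next GCI,
    # so it is centered roughly on the current GCI.
    for prev, nxt in zip(gci[:-2], gci[2:]):
        frame = signal[prev:nxt].astype(float)
        # Hann window is an assumption; the abstract names no window.
        frames.append(frame * np.hanning(len(frame)))
    return frames

# Toy input: random samples and hypothetical GCI sample indices
# with a slowly growing pitch period.
signal = np.random.default_rng(0).standard_normal(1000)
gci = np.array([100, 180, 262, 346, 432])

frames = pitch_synchronous_frames(signal, gci)
frame_lengths = [len(f) for f in frames]
```

In this toy case the frame lengths differ from one frame to the next because they follow the local pitch period. That variability is the kind of mismatch with a fixed frame rate and fixed feature dimension that the abstract describes as non-trivial to integrate.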
Pages: 408-411
Number of pages: 4