A Pitch-Synchronous Speech Analysis and Synthesis Method for DNN-SPSS System

Authors
Kim, Jin-Seob [1 ]
Joo, Young-Sun [1 ]
Kang, Hong-Goo [1 ]
Jang, Inseon [2 ]
Ahn, ChungHyun [2 ]
Seo, Jeongil [2 ]
Affiliations
[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul, South Korea
[2] Elect & Telecommun Res Inst, Realist Broadcasting Media Res Dept, Daejeon, South Korea
Keywords
Deep neural network (DNN); statistical parametric speech synthesis (SPSS); pitch-synchronous; glottal closure instants (GCIs); DEEP NEURAL-NETWORKS
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes
0808 ; 0809 ;
Abstract
This paper proposes a pitch-synchronous deep neural network (DNN)-based statistical parametric speech synthesis (SPSS) system. Speech parameters are extracted from pitch-synchronous frames defined by the locations of glottal closure instants (GCIs), which significantly reduces coupling effects between the vocal tract and excitation signals. As a result, the distribution of spectral parameters within the same phonetic context becomes more uniform, which improves model trainability, especially for a small-scale DNN framework. Although the effectiveness of the pitch-synchronous approach has been proven in other applications, it is not trivial to integrate the method into typical DNN-based SPSS systems, which have regularized structures, i.e., a fixed frame rate and a fixed feature dimension. In this paper, we design a new DNN-based SPSS system that trains and generates speech parameters pitch-synchronously. Objective and subjective test results verify the superiority of the proposed system over the conventional approach.
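To make the idea of pitch-synchronous framing concrete, the following is a minimal sketch, not the authors' implementation. It assumes GCI locations are already available as sample indices (the abstract does not say how they are detected), and the function name `pitch_synchronous_frames`, the two-period frame span, and the Hann window are all illustrative choices. It shows why such frames have a variable rate and a variable length, which is the property that conflicts with fixed-frame-rate DNN-SPSS pipelines.

```python
import math


def pitch_synchronous_frames(signal, gcis):
    """Cut one Hann-windowed frame per interior GCI.

    Each frame spans from the previous GCI to the next GCI (roughly two
    pitch periods), so frame length follows the local pitch period.
    `gcis` is assumed to be a strictly increasing list of sample indices.
    Returns a list of (gci_index, frame_samples) pairs.
    """
    frames = []
    for prev, cur, nxt in zip(gcis, gcis[1:], gcis[2:]):
        segment = signal[prev:nxt]
        n = len(segment)
        if n < 2:
            continue  # too short to window meaningfully
        # Hann window sized to this particular frame (illustrative choice).
        window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
        frames.append((cur, [s * w for s, w in zip(segment, window)]))
    return frames


# Toy input: a 100 Hz sine at 16 kHz with hand-placed, slightly irregular GCIs.
fs = 16000
signal = [math.sin(2 * math.pi * 100 * t / fs) for t in range(1600)]
gcis = [0, 160, 320, 470, 640]  # hypothetical GCI positions in samples

frames = pitch_synchronous_frames(signal, gcis)
centers = [c for c, _ in frames]
lengths = [len(f) for _, f in frames]
```

In this toy case the frame lengths differ from frame to frame because the GCI spacing is irregular, so any downstream model would need to handle non-uniform frame timing and, depending on the parameterization, non-uniform feature sizes.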
Pages: 408-411
Number of pages: 4