A Pitch-Synchronous Speech Analysis and Synthesis Method for DNN-SPSS System

Times cited: 0
Authors
Kim, Jin-Seob [1]
Joo, Young-Sun [1]
Kang, Hong-Goo [1]
Jang, Inseon [2]
Ahn, ChungHyun [2]
Seo, Jeongil [2]
Affiliations
[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul, South Korea
[2] Elect & Telecommun Res Inst, Realist Broadcasting Media Res Dept, Daejeon, South Korea
Keywords
Deep neural network (DNN); statistical parametric speech synthesis (SPSS); pitch-synchronous; glottal closure instants (GCIs); DEEP NEURAL-NETWORKS;
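DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical technology]; TN [Electronic technology, communication technology];
Discipline classification code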
0808 ; 0809 ;
Abstract
This paper proposes a pitch-synchronous deep neural network (DNN)-based statistical parametric speech synthesis (SPSS) system. Pitch-synchronous frames, defined by the locations of glottal closure instants (GCIs), are used to extract speech parameters, which significantly reduces coupling effects between the vocal tract and excitation signals. As a result, the distribution of spectral parameters within the same phonetic context class becomes more uniform, which improves model trainability, especially for a small-scale DNN framework. Although the effectiveness of the pitch-synchronous approach has been proven in other applications, it is not trivial to integrate the method into typical DNN-based SPSS systems, which have regularized structures, i.e., a fixed frame rate and a fixed feature dimension. In this paper, we design a new DNN-based SPSS system that trains and generates speech parameters pitch-synchronously. Objective and subjective test results verify the superiority of the proposed system over the conventional approach.
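The idea of pitch-synchronous framing mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the GCIs are already available as sorted sample indices (GCI detection is out of scope), and it uses a common generic choice of one Hann-windowed frame per GCI spanning the previous to the next GCI. The function name `pitch_synchronous_frames` and the synthetic test values are hypothetical.

```python
import math


def pitch_synchronous_frames(signal, gcis):
    """Cut one Hann-windowed frame per interior GCI (illustrative only).

    Each frame runs from the previous GCI to the next GCI, i.e. roughly
    two pitch periods around the current GCI, so the frame length varies
    with the local pitch instead of being fixed.
    `gcis` is assumed to be a strictly increasing list of sample indices.
    """
    frames = []
    for prev, cur, nxt in zip(gcis, gcis[1:], gcis[2:]):
        segment = signal[prev:nxt + 1]
        n = len(segment)
        # Symmetric Hann window over the whole two-period segment.
        window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1))
                  for i in range(n)]
        frames.append((cur, [s * w for s, w in zip(segment, window)]))
    return frames


# Synthetic example: constant signal, pitch period changing from 80 to 100 samples.
signal = [1.0] * 400
gcis = [0, 80, 160, 260, 360]
for center, frame in pitch_synchronous_frames(signal, gcis):
    print(center, len(frame))
```

The variable frame lengths produced here are exactly what makes integration with a fixed-rate, fixed-dimension DNN pipeline non-trivial, as the abstract notes; any length normalization of the resulting features would be a separate design decision.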
Pages: 408-411
Page count: 4
Related papers
50 items in total
  • [41] Pitch alteration technique in a speech synthesis system
    Jang, KY
    Kim, JJ
    Bae, MJ
    IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS - 2000 DIGEST OF TECHNICAL PAPERS, 2000, : 332 - 333
  • [42] Speech quality evaluation for different pitch detection algorithms in LPC speech analysis–synthesis system
    Sandeep Kumar
    Sneha Singh
    Prabhakar Agarwal
    Upendra Kumar Acharya
    Prabira Kumar Sethy
    Chanki Pandey
    International Journal of Speech Technology, 2021, 24 : 545 - 551
  • [43] Classification of Parkinson's disease Using Pitch Synchronous Speech Analysis
    Appakaya, Sai Bharadwaj
    Sankar, Ravi
    2018 40TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2018, : 1420 - 1423
  • [44] Pitch synchronous addition and extension for linear predictive analysis of noisy speech
    Shimamura, T
    NORSIG 2004: PROCEEDINGS OF THE 6TH NORDIC SIGNAL PROCESSING SYMPOSIUM, 2004, 46 : 196 - 199
  • [45] Pitch synchronous and glottal closure based speech analysis for language recognition
    Rao, K.
    Maity, Sudhamay
    Reddy, V.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2013, 16 (04) : 413 - 430
  • [46] Speech quality evaluation for different pitch detection algorithms in LPC speech analysis-synthesis system
    Kumar, Sandeep
    Singh, Sneha
    Agarwal, Prabhakar
    Acharya, Upendra Kumar
    Sethy, Prabira Kumar
    Pandey, Chanki
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2021, 24 (03) : 545 - 551
  • [47] Pitch-synchronous peak-amplitude (PS-PA)-based feature extraction method for noise-robust ASR
    Ghulam, Muhammad
    Katsurada, Kouichi
    Horikawa, Junsei
    Nitta, Tsuneo
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2006, E89D (11) : 2766 - 2774
  • [48] Analysis by synthesis speech coding with generalized pitch prediction
    Mermelstein, P
    Qian, YS
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 1 - 4
  • [49] Pink Noise Whitening Method for Pitch Synchronous LPC Analysis
    Liu, Liqing
    Shimamura, Tetsuya
    2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012,
  • [50] A Method for Measuring the Pitch Frequency of Speech Signals for the Systems of Acoustic Speech Analysis
    A. V. Savchenko
    V. V. Savchenko
    Measurement Techniques, 2019, 62 : 282 - 288