A Pitch-Synchronous Speech Analysis and Synthesis Method for DNN-SPSS System

Times cited: 0

Authors:

Kim, Jin-Seob [1]

Joo, Young-Sun [1]

Kang, Hong-Goo [1]

Jang, Inseon [2]

Ahn, ChungHyun [2]

Seo, Jeongil [2]

Affiliations:

[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul, South Korea

[2] Elect & Telecommun Res Inst, Realist Broadcasting Media Res Dept, Daejeon, South Korea

Source:

2016 IEEE INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP) | 2016

Keywords:

Deep neural network (DNN); statistical parametric speech synthesis (SPSS); pitch-synchronous; glottal closure instants (GCIs); DEEP NEURAL-NETWORKS;

DOI:

Not available

Chinese Library Classification (CLC) number:

TM [Electrical technology]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper proposes a pitch-synchronous deep neural network (DNN)-based statistical parametric speech synthesis (SPSS) system. Pitch-synchronous frames, defined by the locations of glottal closure instants (GCIs), are used to extract speech parameters, which significantly reduces coupling effects between vocal tract and excitation signals. As a result, the distribution of spectral parameters within the same phonetic context class becomes more uniform, which improves model trainability, especially for a small-scale DNN framework. Although the effectiveness of the pitch-synchronous approach has been proven in other applications, it is not trivial to integrate the method into typical DNN-based SPSS systems that have regularized structures, i.e., a fixed frame rate and a fixed feature dimension. In this paper, we design a new DNN-based SPSS system that pitch-synchronously trains and generates speech parameters. Objective and subjective test results verify the superiority of the proposed system compared to the conventional approach.
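
The abstract does not give the framing details, so the following Python sketch is only an illustration of what pitch-synchronous framing can look like, not the authors' method. It assumes the GCI positions are already available as sorted sample indices, and it assumes each frame spans two pitch periods (from the previous GCI to the next one) under a Hann window. The function names `hann` and `pitch_synchronous_frames` are hypothetical.

```python
import math


def hann(n):
    """Return a symmetric Hann window of length n (assumed window choice)."""
    if n < 2:
        return [1.0] * n
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def pitch_synchronous_frames(signal, gcis):
    """Cut one windowed frame per interior GCI.

    signal: sequence of samples.
    gcis:   sorted GCI positions as sample indices (assumed to be given).
    Each frame runs from the previous GCI to the next GCI, so its length
    follows the local pitch period instead of a fixed frame size.
    """
    frames = []
    for prev, nxt in zip(gcis[:-2], gcis[2:]):
        segment = signal[prev:nxt]
        window = hann(len(segment))
        frames.append([s * w for s, w in zip(segment, window)])
    return frames


if __name__ == "__main__":
    toy_signal = [1.0] * 100           # placeholder samples, not real speech
    toy_gcis = [10, 30, 55, 90]        # made-up GCI positions
    for frame in pitch_synchronous_frames(toy_signal, toy_gcis):
        print(len(frame))
```

Because the frame lengths vary with the pitch period, the frames do not fit directly into a model that expects a fixed frame rate and a fixed feature dimension, which is the integration difficulty the abstract points out.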

Pages: 408 - 411

Page count: 4

50 entries in total

[1] A PITCH-SYNCHRONOUS ANALYSIS OF HOARSENESS IN RUNNING SPEECH
MUTA, H
BAER, T
WAGATSUMA, K
MURAOKA, T
FUKUDA, H
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1988, 84 (04) : 1292 - 1301
[2] A Naxi speech synthesis system based on Pitch-Synchronous Overlap-Add
Yang, J
Pu, YY
Liu, B
ICSP '98: 1998 FOURTH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PROCEEDINGS, VOLS I AND II, 1998 : 654 - 657
[3] NOTE ON PITCH-SYNCHRONOUS PROCESSING OF SPEECH
DAVID, EE
MCDONALD, HS
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1956, 28 (06) : 1261 - 1266
[4] A PITCH-SYNCHRONOUS ANALYSIS SYNTHESIS SYSTEM TO INDEPENDENTLY MODIFY FORMANT FREQUENCIES AND BANDWIDTHS FOR VOICED SPEECH
KUWABARA, H
SPEECH COMMUNICATION, 1984, 3 (03) : 211 - 220
[5] A NOTE ON PITCH-SYNCHRONOUS PROCESSING OF SPEECH
DAVID, EE
MCDONALD, HS
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1956, 28 (01) : 159 - 159
[6] Naxi speech synthesis system based on pitch-synchronous overlap-add
Yang, Jian
Pu, Yuanyuan
Liu, Bing
International Conference on Signal Processing Proceedings, ICSP, 1998, 1 : 654 - 657
[7] PITCH-SYNCHRONOUS DIGITAL FEATURE EXTRACTION SYSTEM FOR PHONEMIC RECOGNITION OF SPEECH
HESS, WJ
IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1976, 24 (01) : 14 - 25
[8] Pitch-synchronous speech signal segmentation and its applications
Petrushin, VA
TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2003, 2807 : 321 - 326
[9] PITCH-SYNCHRONOUS WAVELET REPRESENTATIONS OF SPEECH AND MUSIC SIGNALS
EVANGELISTA, G
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1993, 41 (12) : 3313 - 3330
[10] A MODIFIED AUTO-CORRELATION METHOD OF LINEAR PREDICTION FOR PITCH-SYNCHRONOUS ANALYSIS OF VOICED SPEECH
PALIWAL, KK
RAO, PVS
SIGNAL PROCESSING, 1981, 3 (02) : 181 - 185