Superpositional HMM-based intonation synthesis using a functional F0 model

Cited by: 0

Authors:

Ni, Jinfu [1]

Shiga, Yoshinori [1]

Hori, Chiori [1]

Affiliations:

[1] Natl Inst Informat & Commun Technol, Spoken Language Commun Lab, Univ Commun Res Inst, Kyoto, Japan

Source:

2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2014

Keywords:

Intonation synthesis; HMM-based speech synthesis; functional F0 model; making focal prominence; prosody; AUTOMATIC EXTRACTION; SPEECH;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper addresses intonation synthesis that combines statistical and functional approaches, with manipulation of fundamental frequency (F0) contours, in HMM-based speech synthesis. An F0 contour is represented as a sum of micro, accent, and register components on a logarithmic scale, a representation rooted in the Fujisaki model. Separate context-dependent (CD) HMMs are trained for each type of component, extracted from a speech corpus using a functional F0 model. At synthesis time, the CDHMM-generated micro, accent, and register components are superimposed to form F0 contours for the input text. Objective and subjective evaluations are carried out on a Japanese speech corpus. Compared with the conventional approach, this method not only improves the naturalness of synthetic speech by achieving better global F0 behavior but also offers flexibility for intonation manipulation through modification of the functional model parameters.
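
The superposition step described in the abstract can be illustrated with a minimal sketch. It assumes each component is already available as a frame-level track in the log-F0 domain; the function name `superimpose_log_f0`, the toy values, and the accent scaling factor are hypothetical and are not taken from the paper. Scaling the accent track directly is only a crude stand-in for modifying functional model parameters.

```python
import math


def superimpose_log_f0(micro, accent, register):
    """Add three log-F0 component tracks frame by frame; return F0 in Hz.

    Hypothetical helper: the paper's actual generation procedure is not
    given in the abstract, only that the components sum on a log scale.
    """
    if not (len(micro) == len(accent) == len(register)):
        raise ValueError("component tracks must have the same length")
    # Addition in the log domain corresponds to multiplication in Hz.
    return [math.exp(m + a + r) for m, a, r in zip(micro, accent, register)]


# Toy values, assumed for illustration only.
n_frames = 5
register = [math.log(120.0)] * n_frames   # slowly varying baseline level
accent = [0.0, 0.1, 0.2, 0.1, 0.0]        # local rise and fall
micro = [0.0, 0.01, -0.01, 0.0, 0.0]      # small segmental perturbations

f0_hz = superimpose_log_f0(micro, accent, register)

# Illustrative manipulation: enlarge the accent component before summing
# to raise the peak, leaving the micro and register components unchanged.
scaled_accent = [1.5 * a for a in accent]
f0_emphasized = superimpose_log_f0(micro, scaled_accent, register)
```

Because the components are kept separate until the final sum, one of them can be altered without touching the others, which is the kind of flexibility the abstract attributes to the method.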


Pages: 270-274

Number of pages: 5

