F0 Modeling for Isarn Speech Synthesis using Deep Neural Networks and Syllable-level Feature Representation

Cited by: 1

Authors:

Janyoi, Pongsathon [1]

Seresangtakul, Pusadee [2]

Affiliations:

[1] Khon Kaen Univ, Dept Comp Sci, Nat Language & Speech Proc Lab, Khon Kaen, Thailand

[2] Khon Kaen Univ, Dept Comp Sci, Fac Sci, Khon Kaen, Thailand

Source:

INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY | 2020 / Vol. 17 / No. 06

Keywords:

Fundamental frequency; speech synthesis; deep neural networks; HMM; GENERATION;

DOI:

10.34028/iajit/17/6/9

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The generation of the fundamental frequency (F0) plays an important role in speech synthesis because it directly influences the naturalness of synthetic speech. In conventional parametric speech synthesis, F0 is predicted frame by frame. This method is insufficient to represent F0 contours in larger units, especially the tone contours of syllables in tonal languages, which deviate as a result of long-term context dependency. This work proposes a syllable-level F0 model that represents F0 contours within syllables, using syllable-level F0 parameters that comprise sampled F0 points and dynamic features. A Deep Neural Network (DNN) was used to represent the relationships between syllable-level contextual features and syllable-level F0 parameters. The proposed model was examined using an Isarn speech synthesis system with both large and small training sets. For all training sets, the results of objective and subjective tests indicate that the proposed approach outperforms the baseline systems based on hidden Markov models (HMMs) and DNNs that predict F0 values at the frame level.
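As an illustration of what a syllable-level F0 parameter vector of "sampled F0 points and dynamic features" could look like, the following is a minimal sketch and not the authors' implementation. The function name `syllable_f0_params`, the number of sampled points, the use of log-F0, linear interpolation, and first-order differences as the dynamic features are all assumptions made for this example; the abstract does not specify any of them.

```python
import numpy as np

def syllable_f0_params(f0_frames, num_points=10):
    """Hypothetical helper: turn one syllable's frame-level F0 contour into
    a fixed-length vector of sampled points plus simple dynamic features.

    Assumptions (not from the paper): all frames are voiced (F0 > 0),
    sampling is linear interpolation of log-F0 at equally spaced positions,
    and dynamic features are first-order differences of the sampled points.
    """
    f0 = np.asarray(f0_frames, dtype=float)
    if f0.size < 2:
        raise ValueError("need at least two F0 frames")
    log_f0 = np.log(f0)

    # Normalised time positions of the original frames and of the samples.
    src = np.linspace(0.0, 1.0, f0.size)
    dst = np.linspace(0.0, 1.0, num_points)
    sampled = np.interp(dst, src, log_f0)

    # First-order differences between neighbouring sampled points.
    delta = np.diff(sampled)
    return np.concatenate([sampled, delta])

# Example: a rising-falling contour (Hz) for a single syllable.
contour = [120.0, 125.0, 131.0, 138.0, 140.0, 137.0, 130.0]
params = syllable_f0_params(contour, num_points=5)
print(params.shape)  # (9,)
```

Because every syllable maps to a vector of the same length regardless of its duration, such vectors can serve as regression targets for a network whose inputs are syllable-level contextual features, which is the kind of mapping the abstract describes.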


Pages: 906-915

Page count: 10

