Progressive Neural Networks based Features Prediction for the Target Cost in Unit-Selection Speech Synthesizer

Cited by: 0
Authors
Fu, Ruibo [1, 2]
Tao, Jianhua [1, 2, 3]
Wen, Zhengqi [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[3] CAS Ctr Excellence Brain Sci & Intelligence Techn, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
speech synthesis; unit-selection; target cost; progressive neural networks
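DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract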
This paper describes a method that uses progressive neural networks to directly predict the acoustic features used to calculate the target cost. Conventional methods involve many hand-tuning steps; our method instead predicts the features for target cost calculation directly. Using a progressive deep neural network (PDNN) to predict these acoustic features allows the correlations among them to be modeled. Each type of acoustic feature and each part of a unit is modeled in a separate sub-network with its own cost function, and knowledge is transferred between sub-networks through lateral connections. The sub-networks in the PDNN are trained one after another, so each can reach its own optimum. Extensive comparative evaluations show that the PDNN improves the accuracy of the predicted acoustic features. Subjective evaluation results show that calculating the target cost with the proposed method improves the naturalness of the synthetic speech.
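The idea of sub-networks linked by lateral connections can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the split into an F0 column and a spectral column, the `Column` class, and the Euclidean `target_cost` function are all assumptions made for illustration, and training (where each column would be optimized against its own cost function in turn) is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    # Small random weights and a zero bias for one fully connected layer.
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

class Column:
    """One hypothetical sub-network: a tanh hidden layer plus a linear output."""

    def __init__(self, n_in, n_hidden, n_out, n_lateral=0):
        self.W1, self.b1 = dense(n_in, n_hidden)
        self.W2, self.b2 = dense(n_hidden, n_out)
        # Lateral weights read the hidden layer of an earlier column.
        self.U = (rng.normal(scale=0.1, size=(n_lateral, n_hidden))
                  if n_lateral else None)

    def forward(self, x, prev_hidden=None):
        pre = x @ self.W1 + self.b1
        if self.U is not None and prev_hidden is not None:
            # Knowledge transfer: add the earlier column's hidden activations.
            pre = pre + prev_hidden @ self.U
        h = np.tanh(pre)
        return h, h @ self.W2 + self.b2

def target_cost(predicted, candidates):
    # Assumed cost: Euclidean distance between the predicted features
    # and the features of each candidate unit.
    return np.linalg.norm(candidates - predicted, axis=1)

# Hypothetical sizes: 20 linguistic inputs, 1 F0 value, 8 spectral values.
x = rng.normal(size=(1, 20))
col_f0 = Column(20, 16, 1)
col_spec = Column(20, 16, 8, n_lateral=16)

h_f0, f0 = col_f0.forward(x)
h_spec, spec = col_spec.forward(x, prev_hidden=h_f0)

predicted = np.concatenate([f0, spec], axis=1)  # shape (1, 9)
candidates = rng.normal(size=(5, 9))            # 5 made-up candidate units
costs = target_cost(predicted, candidates)
best = int(np.argmin(costs))
print("target costs:", costs, "selected candidate:", best)
```

In this sketch the second column sees both the input and the first column's hidden layer, which is one simple way correlations between feature types could be captured; the candidate with the lowest cost would be the one preferred by the target cost term.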
Pages: 504-509
Page count: 6