A novel prosodic-information synthesizer based on recurrent fuzzy neural network for the Chinese TTS system

Cited by: 11
Authors
Lin, CT [1]
Wu, RC [1]
Chang, JY [1]
Liang, SF [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Elect & Control Engn, Hsinchu 300, Taiwan
Keywords
Chinese text-to-speech system; fuzzy inference engine; prosodic information; recurrent neural network; sandhi rules; speech synthesizer
DOI
10.1109/TSMCB.2003.811518
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, a new technique for the Chinese text-to-speech (TTS) system is proposed. Our major effort focuses on prosodic information generation. New methodologies are developed for constructing fuzzy rules in a prosodic model that simulates human pronunciation rules. The proposed Recurrent Fuzzy Neural Network (RFNN) is a multilayer recurrent neural network (RNN) that integrates a Self-constructing Neural Fuzzy Inference Network (SONFIN) into a recurrent connectionist structure. The RFNN can be functionally divided into two parts. The first part adopts the SONFIN as a prosodic model to explore the relationship between high-level linguistic features and prosodic information based on fuzzy inference rules. Compared with conventional neural networks, the SONFIN can always construct itself with an economic network size and high learning speed. The second part employs a five-layer network to generate all prosodic parameters by directly using the prosodic fuzzy rules inferred from the first part as well as other important features of syllables. The TTS system combined with the proposed method can reproduce not only sandhi rules but also the other prosodic phenomena found in traditional TTS systems. Moreover, the proposed scheme can even discover some new rules about prosodic phrase structure. The performance of the proposed RFNN-based prosodic model is verified by embedding it into a Chinese TTS system with a Chinese monosyllable database based on the time-domain pitch synchronous overlap add (TD-PSOLA) method. Our experimental results show that the proposed RFNN can generate proper prosodic parameters including pitch means, pitch shapes, maximum energy levels, syllable duration, and pause duration. Some synthetic sounds are available online for demonstration.
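As a rough illustration of the general idea of a recurrent fuzzy rule layer (not the authors' RFNN or SONFIN, whose structure and learning procedure are not given in this record), the following minimal sketch computes Gaussian rule firing strengths from input features, blends them with the previous step's firing strengths, and defuzzifies by a weighted average. All names (`gaussian`, `fuzzy_step`, `synthesize`), the `feedback` parameter, the rule format, and the numeric values are assumptions made for this example.

```python
import math


def gaussian(x, center, width):
    """Gaussian membership value of x in a fuzzy set (hypothetical form)."""
    return math.exp(-((x - center) / width) ** 2)


def fuzzy_step(features, prev_firing, rules, feedback=0.5):
    """One time step of a toy recurrent fuzzy layer.

    features    : list of floats (assumed linguistic features of one syllable)
    prev_firing : normalized firing strengths from the previous syllable
    rules       : list of dicts with 'centers', 'widths', 'output'
    Returns (value, firing), where firing sums to 1.
    """
    raw = []
    for rule, prev in zip(rules, prev_firing):
        # Rule firing strength: product of per-feature memberships.
        strength = 1.0
        for x, c, w in zip(features, rule["centers"], rule["widths"]):
            strength *= gaussian(x, c, w)
        # Recurrent link: blend with this rule's previous firing strength.
        raw.append((1.0 - feedback) * strength + feedback * prev)
    total = sum(raw) or 1.0  # avoid division by zero if nothing fires
    firing = [r / total for r in raw]
    # Defuzzify: firing-weighted average of the rule outputs.
    value = sum(f * rule["output"] for f, rule in zip(firing, rules))
    return value, firing


def synthesize(sequence, rules, feedback=0.5):
    """Run the toy layer over a sequence of per-syllable feature vectors."""
    firing = [0.0] * len(rules)
    outputs = []
    for features in sequence:
        value, firing = fuzzy_step(features, firing, rules, feedback)
        outputs.append(value)
    return outputs


# Two made-up rules mapping one feature to a pitch-like value.
rules = [
    {"centers": [0.0], "widths": [1.0], "output": 100.0},
    {"centers": [10.0], "widths": [1.0], "output": 200.0},
]
outputs = synthesize([[0.0], [10.0]], rules)
```

In this toy setup the second output is pulled toward the first rule's value even though the second input matches only the second rule, which is the effect the recurrent feedback term is meant to show: the value produced for a syllable depends on the preceding context as well as on its own features.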
Pages: 309-324
Number of pages: 16
Related papers
50 in total
  • [1] Enhance the word vector with prosodic information for the recurrent neural network based TTS system
    Wang, Xin
    Takaki, Shinji
    Yamagishi, Junichi
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2856 - 2860
  • [2] WORD EMBEDDING FOR RECURRENT NEURAL NETWORK BASED TTS SYNTHESIS
    Wang, Peilu
    Qian, Yao
    Soong, Frank K.
    He, Lei
    Zhao, Hai
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4879 - 4883
  • [3] Recurrent fuzzy neural network based system for battery charging
    Aliev, R. A.
    Aliev, R. R.
    Guirimov, B. G.
    Uyar, K.
    ADVANCES IN NEURAL NETWORKS - ISNN 2007, PT 2, PROCEEDINGS, 2007, 4492 : 307 - +
  • [4] Integrating Prosodic Information into Recurrent Neural Network Language Model For Speech Recognition
    Fu, Tong
    Han, Yang
    Li, Xiangang
    Liu, Yi
    Wu, Xihong
    2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2015, : 1194 - 1197
  • [5] A fuzzy neural network based evaluation approach for information system
    Ma, Wei-Min
    Ma, Xiu-Juan
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 2874 - 2879
  • [6] A novel recurrent neural network based prediction system for trading
    Quek, Chai
    Pasquier, Michel
    Kumar, Neha
    2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 2090 - +
  • [7] A Novel Greenhouse Control System Based on Fuzzy Neural Network
    Zhang, Xinyi
    MECHANICAL COMPONENTS AND CONTROL ENGINEERING III, 2014, 668-669 : 415 - 418
  • [8] Research on Risk Assessment of Information System Based on Fuzzy Neural Network
    Zhu, Guangliang
    Wang, Yuanbao
    PROCEEDINGS OF THE INTERNATIONAL ACADEMIC CONFERENCE ON FRONTIERS IN SOCIAL SCIENCES AND MANAGEMENT INNOVATION (IAFSM 2018), 2018, 62 : 50 - 55
  • [9] A Telecommunications Call Volume Forecasting System based on a Recurrent Fuzzy Neural Network
    Mastorocostas, Paris A.
    Hilas, Constantinos S.
    Varsamis, Dimitris N.
    Dova, Stergiani C.
    2013 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2013,
  • [10] A compensation-based recurrent fuzzy neural network for dynamic system identification
    Lin, CJ
    Chen, CH
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2006, 172 (02) : 696 - 715