Trainable prosodic model for standard Chinese Text-to-Speech system

Author
TAO Jianhua
Affiliation
Funding
National Natural Science Foundation of China;
Keywords
Trainable prosodic model for standard Chinese Text-to-Speech system; Text;
DOI
10.15949/j.cnki.0217-9776.2001.03.007
Chinese Library Classification (CLC) number
H017 [Experimental phonetics (instrumental phonetics)];
Discipline classification code
030303 ; 0501 ; 050102 ;
Abstract
Putonghua prosody exhibits a hierarchical structure under the influence of its linguistic environment. Based on this observation, a neural network with specially weighted factors and optimized outputs is described and applied to construct the Putonghua prosodic model of a Text-to-Speech (TTS) system. Extensive tests show that the network structure characterizes Putonghua prosody more accurately than traditional models. Learning is faster and computational precision is improved, which makes the whole prosodic model more efficient. The paper also stylizes Putonghua syllable pitch contours with SPiS parameters (Syllable Pitch Stylized Parameters) and analyzes their use in adjusting syllable pitch. The results show that the SPiS parameters effectively characterize Putonghua syllable pitch contours and facilitate both the construction of the network model and prosodic control.
Pages: 257-265
Page count: 9