Automatic generation of prosodic structure for high quality Mandarin speech synthesis

Cited by: 0
Authors
Chou, FC
Tseng, CY
Lee, LS
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
A key problem in today's speech synthesis technology is to automatically generate an appropriate hierarchical prosodic structure for input text and incorporate it into the synthesized speech [1][2]. This paper presents a method for addressing this problem in Mandarin Chinese. The method uses a speech database to train a statistical model that generates the prosodic structure and determines prosodic parameters such as syllable duration, pause, energy, and intonation. Experimental results show that an accuracy of 83.1% can be achieved in predicting prosodic structure. Furthermore, a Chinese text-to-speech system can be developed based on the proposed prosodic structure.
Pages: 1624-1627
Number of pages: 4
Related papers
50 in total
  • [1] Parsing hierarchical prosodic structure for Mandarin speech synthesis
    Xu, Dawei
    Wang, Haifeng
    Li, Guohua
    Kagoshima, Takehiko
    [J]. 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13, 2006, : 745 - 748
  • [2] Learning prosodic patterns for mandarin speech synthesis
    Chen, YQ
    Gao, W
    Zhu, TS
    Ling, C
    [J]. JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2002, 19 (01) : 95 - 109
  • [3] Learning Prosodic Patterns for Mandarin Speech Synthesis
    Yiqiang Chen
    Wen Gao
    Tingshao Zhu
    Charles Ling
    [J]. Journal of Intelligent Information Systems, 2002, 19 : 95 - 109
  • [4] A novel method for Mandarin speech synthesis by inserting prosodic structure prediction into Tacotron2
    Junmin Liu
    Zhuangzhuang Xie
    Chunxia Zhang
    Guang Shi
    [J]. International Journal of Machine Learning and Cybernetics, 2021, 12 : 2809 - 2823
  • [5] A novel method for Mandarin speech synthesis by inserting prosodic structure prediction into Tacotron2
    Liu, Junmin
    Xie, Zhuangzhuang
    Zhang, Chunxia
    Shi, Guang
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2021, 12 (10) : 2809 - 2823
  • [6] Automatic generation of pronunciation lexicons for Mandarin spontaneous speech
    Byrne, W
    Venkataramani, V
    Kamm, T
    Zheng, TF
    Song, Z
    Fung, P
    Liu, Y
    Ruhi, U
    [J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 569 - 572
  • [7] Prosodic Processing for the Automatic Synthesis of Emotional Russian Speech
    Kaliyev, Arman
    Matveev, Yuri N.
    Lyakso, Elena E.
    Rybin, Sergey V.
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE QUALITY MANAGEMENT, TRANSPORT AND INFORMATION SECURITY, INFORMATION TECHNOLOGIES (IT&QM&IS), 2018, : 653 - 655
  • [8] Exploration of high-level prosodic patterns for continuous Mandarin speech
    Chiang, Chen-Yu
    Yu, Hsiu-Min
    Wang, Yih-Ru
    Chen, Sin-Horng
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 3977 - +
  • [9] Prosodic Cues in Polite and Rude Mandarin Speech
    Fan, Ping
    Gu, Wentao
    [J]. 2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016,
  • [10] TREE-BASED APPROACHES TO AUTOMATIC-GENERATION OF SPEECH SYNTHESIS RULES FOR PROSODIC PARAMETERS
    YAMASHITA, Y
    TANAKA, M
    AMAKO, Y
    NOMURA, Y
    OHTA, Y
    KITOH, A
    KAKUSHO, O
    MIZOGUCHI, R
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 1993, E76A (11) : 1934 - 1941