Intonation and Prosody Conversion for Expressive Mandarin Speech Synthesis

Authors
Zhu, Jing [1]
Yu, Yibiao [1]
Affiliation
[1] Soochow Univ, Sch Elect & Informat Engn, Suzhou, Peoples R China
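Keywords
speech synthesis; intonation; prosody; polynomial fitting
DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification code
0808; 0809
Abstract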
Expressive speech synthesis has a wide variety of applications. In contrast to general-purpose Chinese speech synthesis, this paper focuses on prosody and intonation. Prosody is described in terms of three aspects: accent, pause, and speaking rate. Accent is realized by modifying the fundamental frequency and amplitude. A pause is produced by inserting frames whose parameter values are zero. Speaking rate is controlled by copying or deleting frames at specified locations. Because Mandarin is a tonal language, intonation plays a significant role in synthesis. Four intonation patterns are considered: rising, falling, flat, and sinuate. Each pattern is modeled with a polynomial fitting function, and the resulting intonation models are used to convert one pattern into another. The experimental results show that the proposed method achieves good quality in intonation conversion and substantially improves the naturalness of the synthesized speech.
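The abstract gives no implementation details, so the following is only a minimal sketch of how the frame-level operations it describes might look. It assumes NumPy, an F0 contour stored as a 1-D array, and synthesis parameters stored as a (frames x parameters) array. The function names, the polynomial degree, and the choice to swap only the fitted trend while keeping the residual are illustrative assumptions, not the authors' method.

```python
import numpy as np


def fit_intonation(f0, degree=3):
    """Fit a polynomial to an F0 contour over normalized time [0, 1]."""
    f0 = np.asarray(f0, dtype=float)
    t = np.linspace(0.0, 1.0, len(f0))
    return np.polyfit(t, f0, degree)


def convert_intonation(f0, target_coeffs, degree=3):
    """Replace the fitted trend of `f0` with the target pattern's trend.

    Assumption: the residual around the fitted trend (local tone movement)
    is kept unchanged; only the sentence-level trend is swapped.
    """
    f0 = np.asarray(f0, dtype=float)
    t = np.linspace(0.0, 1.0, len(f0))
    source_trend = np.polyval(fit_intonation(f0, degree), t)
    target_trend = np.polyval(target_coeffs, t)
    return f0 - source_trend + target_trend


def insert_pause(frames, index, n_frames):
    """Insert `n_frames` all-zero parameter frames before row `index`."""
    frames = np.asarray(frames, dtype=float)
    silence = np.zeros((n_frames, frames.shape[1]))
    return np.concatenate([frames[:index], silence, frames[index:]], axis=0)


def change_rate(frames, start, stop, factor):
    """Stretch (factor > 1) or compress (factor < 1) frames[start:stop].

    Frames are copied or dropped by nearest-index resampling of the segment.
    """
    frames = np.asarray(frames)
    n_out = max(1, int(round((stop - start) * factor)))
    idx = np.linspace(start, stop - 1, n_out).round().astype(int)
    return np.concatenate([frames[:start], frames[idx], frames[stop:]], axis=0)
```

For example, fitting `fit_intonation` to a rising reference contour and passing the coefficients to `convert_intonation` maps a falling contour onto the rising trend. Accent modification (scaling F0 and amplitude) is not covered by this sketch.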
Pages: 549-552
Number of pages: 4