Intonation and Prosody Conversion for Expressive Mandarin Speech Synthesis

Cited by: 0
Authors
Zhu, Jing [1]
Yu, Yibiao [1]
Affiliations
[1] Soochow Univ, Sch Elect & Informat Engn, Suzhou, Peoples R China
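Keywords
speech synthesis; intonation; prosody; polynomial fitting
DOI
Not available
Chinese Library Classification (CLC) codes
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract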
Expressive speech synthesis has a wide variety of applications. Unlike general-purpose speech synthesis for Chinese, this paper focuses on prosody and intonation. Prosody is described in terms of three aspects: accent, pause, and speaking rate. Accent is stressed by modifying the fundamental frequency and the amplitude. A pause is produced by inserting frames whose parameter values are zero. Speaking rate is controlled by copying or deleting frames at specified locations. Because Mandarin is a tonal language, intonation is significant in synthesis. Four intonation patterns are considered: rising, falling, flat, and sinuate. Each pattern is modeled with a polynomial fitting function, and the resulting intonation model is applied to convert one pattern into another. The experimental results show that the proposed method achieves good quality in tune conversion and considerably improves the naturalness of the synthesized speech.
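The operations named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no frame format, scale factors, polynomial degree, or conversion rule, so everything below is an assumption made for illustration. Frames are hypothetical (f0, amplitude) pairs, the default scale factors and the cubic degree are arbitrary, and convert_intonation assumes that conversion means swapping the fitted polynomial trend of the source contour for the trend of the target pattern. The least-squares fit is written out by hand so the block has no dependencies.

```python
from typing import List, Sequence, Tuple

Frame = Tuple[float, float]  # hypothetical frame layout: (f0_hz, amplitude)


def stress_accent(frames: Sequence[Frame], start: int, end: int,
                  f0_scale: float = 1.2, amp_scale: float = 1.5) -> List[Frame]:
    """Stress frames[start:end] by scaling F0 and amplitude (factors are assumed)."""
    return [(f0 * f0_scale, amp * amp_scale) if start <= i < end else (f0, amp)
            for i, (f0, amp) in enumerate(frames)]


def insert_pause(frames: Sequence[Frame], position: int, n_frames: int) -> List[Frame]:
    """Insert n_frames frames whose parameters are all zero at `position`."""
    return list(frames[:position]) + [(0.0, 0.0)] * n_frames + list(frames[position:])


def change_rate(frames: Sequence[Frame], start: int, end: int, factor: float) -> List[Frame]:
    """Stretch (factor > 1) or shrink (factor < 1) frames[start:end]
    by repeating or dropping whole frames."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    segment = list(frames[start:end])
    if not segment:
        return list(frames)
    n_out = max(1, round(len(segment) * factor))
    picked = [segment[min(len(segment) - 1, int(i / factor))] for i in range(n_out)]
    return list(frames[:start]) + picked + list(frames[end:])


def polyfit(xs: Sequence[float], ys: Sequence[float], degree: int) -> List[float]:
    """Least-squares coefficients c[0] + c[1]*x + ... via the normal equations."""
    n = degree + 1
    # Augmented matrix [A^T A | A^T y] for the Vandermonde matrix A.
    m = [[sum(x ** (r + c) for x in xs) for c in range(n)]
         + [sum(y * x ** r for x, y in zip(xs, ys))] for r in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[r][n] / m[r][r] for r in range(n)]


def polyval(coeffs: Sequence[float], x: float) -> float:
    """Evaluate c[0] + c[1]*x + c[2]*x**2 + ... at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def convert_intonation(f0: Sequence[float], target_coeffs: Sequence[float],
                       degree: int = 3) -> List[float]:
    """Assumed conversion rule: subtract the fitted trend of the source contour
    and add the target pattern's trend. Time is normalized to [0, 1];
    unvoiced frames (f0 == 0) are left untouched."""
    voiced = [(i, v) for i, v in enumerate(f0) if v > 0]
    if len(voiced) <= degree:  # too few points for a stable fit
        return list(f0)
    span = max(len(f0) - 1, 1)
    xs = [i / span for i, _ in voiced]
    src = polyfit(xs, [v for _, v in voiced], degree)
    out = list(f0)
    for (i, v), x in zip(voiced, xs):
        out[i] = v - polyval(src, x) + polyval(target_coeffs, x)
    return out
```

As a usage example under the same assumptions, a rising pattern could be described by coefficients fitted with polyfit to the F0 contour of a reference utterance, and then passed as target_coeffs to convert_intonation for a flat utterance. Keeping the residual around the fitted trend is one way to leave the local tone shapes of individual syllables in place while only the sentence-level tune changes.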
Pages: 549-552
Page count: 4