Modeling Vietnamese Speech Prosody: A Step-by-Step Approach Towards an Expressive Speech Synthesis System

Cited by: 0
Authors
Mac, Dang-Khoa [1]
Tran, Do-Dat [1]
Affiliations
[1] Int Res Inst MICA, HUST CNRS UMI Grenoble INP 2954, Hanoi, Vietnam
Keywords
Text-to-speech; Vietnamese; Prosody modeling; Tones; Phrasing; Attitude; Expressive speech;
DOI
10.1007/978-3-319-25660-3_23
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Adding expressivity to synthesized speech is one of the main strategies in speech technologies. This paper summarizes our research on modeling Vietnamese prosody, with the goal of improving the naturalness of synthesized Vietnamese speech and of integrating expressivity (i.e., emotion and attitude). Based on the concept of a "rendez-vous" between linguistic levels and prosodic functions, we propose decomposing the prosody of an utterance into several components. Each component is then modeled, step by step, by an independent model: a dynamic linear segment model for tones, a relative register model for the F0 level of each syllable, a rule-based approach for phrasing, and an F0 stylization model for the expressive function. All proposed models were integrated into text-to-speech systems and evaluated in perception experiments.
Pages: 273-287
Number of pages: 15