Multi-level Exemplar-Based Duration Generation for Expressive Speech Synthesis

Cited by: 0
Authors
Abou-Zleikha, Mohamed [1]
Szekely, Eva [1]
Cahill, Peter [1]
Carson-Berndsen, Julie [1]
Affiliations
[1] Univ Coll Dublin, Sch Informat & Comp Sci, CNGL, Dublin 2, Ireland
Keywords
speech prosody; duration generation; exemplar-based model
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Generating the duration of speech units from linguistic information, as one component of a prosody model, is considered a requirement for natural-sounding speech synthesis. This paper investigates the use of a multi-level exemplar-based model for duration generation in expressive speech synthesis. The multi-level exemplar-based model has been proposed in the literature as a cognitive model of duration production. Implementing this model for duration generation in speech synthesis is not straightforward: it requires a set of modifications to the model, and both the linguistically related units and the context of the target units must be taken into consideration. The work presented in this paper implements the model and addresses these issues through the use of prosodic-syntactic correlated data, full context information of the input example, and corpus exemplars.
Pages: 59-62
Number of pages: 4