Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources

Citations: 0
Authors
Barakat, Huda [1]
Turk, Oytun [2]
Demiroglu, Cenk [2]
Affiliations
[1] Ozyegin Univ, Dept Comp Sci, TR-34794 Istanbul, Turkiye
[2] Ozyegin Univ, Dept Elect & Elect Engn, TR-34794 Istanbul, Turkiye
Keywords
Speech synthesis; Expressive speech; Emotional speech; Deep learning; EMOTIONAL EXPRESSIONS; STYLE; TEXT; MODEL; REPRESENTATIONS; NETWORK; QUALITY;
DOI
10.1186/s13636-024-00329-7
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Speech synthesis has made significant strides thanks to the transition from machine learning to deep learning models. Contemporary text-to-speech (TTS) models possess the capability to generate speech of exceptionally high quality, closely mimicking human speech. Nevertheless, given the wide array of applications now employing TTS models, mere high-quality speech generation is no longer sufficient. Present-day TTS models must also excel at producing expressive speech that can convey various speaking styles and emotions, akin to human speech. Consequently, researchers have concentrated their efforts on developing more efficient models for expressive speech synthesis in recent years. This paper presents a systematic review of the literature on expressive speech synthesis models published within the last 5 years, with a particular emphasis on approaches based on deep learning. We offer a comprehensive classification scheme for these models and provide concise descriptions of models falling into each category. Additionally, we summarize the principal challenges encountered in this research domain and outline the strategies employed to tackle these challenges as documented in the literature. In Section 8, we pinpoint some research gaps in this field that necessitate further exploration. Our objective with this work is to give an all-encompassing overview of this hot research area to offer guidance to interested researchers and future endeavors in this field.
Pages: 34