Adaptive End-to-End Text-to-Speech Synthesis Based on Error Correction Feedback from Humans

Authors
Fujii, Kazuki [1]
Saito, Yuki [1]
Saruwatari, Hiroshi [1]
Affiliations
[1] Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo Bunkyo-ku, Tokyo, 133-8656, Japan
Keywords
Correct error - Embeddings - End to end - Errors correction - Human listeners - Human-in-the-loop - State of the art - Synthetic speech - Text to speech - Text-to-speech system
Venue
2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
Source database
Engineering Village
Pages: 1702 - 1707
Related papers
50 in total
  • [31] A Novel End-to-End Turkish Text-to-Speech (TTS) System via Deep Learning
    Oyucu, Saadin
    ELECTRONICS, 2023, 12 (08)
  • [32] Reinforce-Aligner: Reinforcement Alignment Search for Robust End-to-End Text-to-Speech
    Chung, Hyunseung
    Lee, Sang-Hoon
    Lee, Seong-Whan
    INTERSPEECH 2021, 2021, : 3635 - 3639
  • [33] PREDICTING EXPRESSIVE SPEAKING STYLE FROM TEXT IN END-TO-END SPEECH SYNTHESIS
    Stanton, Daisy
    Wang, Yuxuan
    Skerry-Ryan, R. J.
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 595 - 602
  • [34] Speaker Adaptation Experiments with Limited Data for End-to-End Text-To-Speech Synthesis using Tacotron2
    Mandeel, Ali Raheem
    Al-Radhi, Mohammed Salah
    Csapo, Tamas Gabor
    INFOCOMMUNICATIONS JOURNAL, 2022, 14 (03) : 55 - 62
  • [35] Boosting subjective quality of Arabic text-to-speech (TTS) using end-to-end deep architecture
    Fahmy, Fady K.
    Abbas, Hazem M.
    Khalil, Mahmoud I.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2022, 25 (01) : 79 - 88
  • [36] You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation
    Laptev, Aleksandr
    Korostik, Roman
    Svischev, Aleksey
    Andrusenko, Andrei
    Medennikov, Ivan
    Rybin, Sergey
    2020 13TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI 2020), 2020, : 439 - 444
  • [37] Boosting subjective quality of Arabic text-to-speech (TTS) using end-to-end deep architecture
    Fady K. Fahmy
    Hazem M. Abbas
    Mahmoud I. Khalil
    International Journal of Speech Technology, 2022, 25 : 79 - 88
  • [38] BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in A Text-to-Speech Front-End
    Zheng, Yibin
    Tao, Jianhua
    Wen, Zhengqi
    Li, Ya
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 47 - 51
  • [39] NIX-TTS: LIGHTWEIGHT AND END-TO-END TEXT-TO-SPEECH VIA MODULE-WISE DISTILLATION
    Chevi, Rendi
    Prasojo, Radityo Eko
    Aji, Alham Fikri
    Tjandra, Andros
    Sakti, Sakriani
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 970 - 976
  • [40] ESPNET-TTS: UNIFIED, REPRODUCIBLE, AND INTEGRATABLE OPEN SOURCE END-TO-END TEXT-TO-SPEECH TOOLKIT
    Hayashi, Tomoki
    Yamamoto, Ryuichi
    Inoue, Katsuki
    Yoshimura, Takenori
    Watanabe, Shinji
    Toda, Tomoki
    Takeda, Kazuya
    Zhang, Yu
    Tan, Xu
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7654 - 7658