Differentiable Duration Refinement Using Internal Division for Non-Autoregressive Text-to-Speech

Authors
Lee, Jaeuk [1]
Shin, Yoonsoo [1]
Chang, Joon-Hyuk [1]
Affiliations
[1] Hanyang University, School of Electronics, Seoul 04763, Republic of Korea
DOI
10.1109/LSP.2024.3495578
Pages: 3154-3158
Related papers
  • [1] Estonian Text-to-Speech Synthesis with Non-autoregressive Transformers
    Ratsep, Liisa
    Lellep, Rasmus
    Fishel, Mark
    BALTIC JOURNAL OF MODERN COMPUTING, 2022, 10 (03): 447-456
  • [2] LIGHTSPEECH: LIGHTWEIGHT NON-AUTOREGRESSIVE MULTI-SPEAKER TEXT-TO-SPEECH
    Li, Song
    Ouyang, Beibei
    Li, Lin
    Hong, Qingyang
    2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021: 499-506
  • [3] Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech
    Li, Yang
    Yu, Cheng
    Sun, Guangzhi
    Jiang, Hua
    Sun, Fanglei
    Zu, Weiqin
    Wen, Ying
    Yang, Yang
    Wang, Jun
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022: 391-400
  • [4] Exploring Timbre Disentanglement in Non-Autoregressive Cross-Lingual Text-to-Speech
    Zhan, Haoyue
    Yu, Xinyuan
    Zhang, Haitong
    Zhang, Yang
    Lin, Yue
    INTERSPEECH 2022, 2022: 4247-4251
  • [5] Hierarchical and Multi-Scale Variational Autoencoder for Diverse and Natural Non-Autoregressive Text-to-Speech
    Bae, Jae-Sung
    Yang, Jinhyeok
    Bak, Tae-Jun
    Joo, Young-Sun
    INTERSPEECH 2022, 2022: 813-817
  • [6] VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis
    Lu, Hui
    Wu, Zhiyong
    Wu, Xixin
    Li, Xu
    Kang, Shiyin
    Liu, Xunying
    Meng, Helen
    INTERSPEECH 2021, 2021: 3775-3779
  • [7] FCH-TTS: Fast, Controllable and High-quality Non-Autoregressive Text-to-Speech Synthesis
    Zhou, Xun
    Zhou, Zhiyang
    Shi, Xiaodong
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [8] MIXER-TTS: NON-AUTOREGRESSIVE, FAST AND COMPACT TEXT-TO-SPEECH MODEL CONDITIONED ON LANGUAGE MODEL EMBEDDINGS
    Tatanov, Oktai
    Beliaev, Stanislav
    Ginsburg, Boris
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 7482-7486
  • [9] A COMPARATIVE STUDY ON NON-AUTOREGRESSIVE MODELINGS FOR SPEECH-TO-TEXT GENERATION
    Higuchi, Yosuke
    Chen, Nanxin
    Fujita, Yuya
    Inaguma, Hirofumi
    Komatsu, Tatsuya
    Lee, Jaesong
    Nozaki, Jumon
    Wang, Tianzi
    Watanabe, Shinji
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021: 47-54
  • [10] Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
    Elias, Isaac
    Zen, Heiga
    Shen, Jonathan
    Zhang, Yu
    Jia, Ye
    Skerry-Ryan, R. J.
    Wu, Yonghui
    INTERSPEECH 2021, 2021: 141-145