End-to-end Speech Synthesis for Tibetan Lhasa Dialect

Cited by: 2
Authors
Luo, Lisai [1]
Li, Guanyu [1]
Gong, Chunwei [1]
Ding, Hailan [1]
Affiliation
[1] Northwest Minzu University, Key Laboratory of National Language Intelligent Processing of Gansu Province, Lanzhou, Gansu, People's Republic of China
DOI
10.1088/1742-6596/1187/5/052061
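Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract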
Speech synthesis for the Tibetan Lhasa dialect is implemented with Tacotron, a recent end-to-end speech synthesis framework. The training transcripts use phoneme sequences transcribed from Tibetan characters, and the feature parameters are extracted from the mel-spectrogram. The model is then trained on the mapping from characters to spectrum. Tibetan is an important minority language in China, yet little research on it exists at present. Compared with a traditional GMM-HMM synthesis method, the synthesized audio is significantly better in both naturalness and rhythm. The work provides a reference for later research on Tibetan speech synthesis and supports the development of Tibetan language research.
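As an illustration of the mel-spectrogram features mentioned in the abstract, the following is a minimal sketch of how one power-spectrum frame can be projected onto triangular mel filters. It is not the authors' code. The function names, the HTK-style mel formula, and the sizes used (8 filters, a 64-point FFT, a 16 kHz sampling rate) are assumptions chosen for illustration, since this record does not state the settings used in the paper.

```python
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to mel (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build n_mels triangular filters over the n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 points equally spaced on the mel scale: band edges and centres.
    points_hz = [
        mel_to_hz(low + (high - low) * i / (n_mels + 1))
        for i in range(n_mels + 2)
    ]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    filters = []
    for m in range(1, n_mels + 1):
        left, centre, right = points_hz[m - 1], points_hz[m], points_hz[m + 1]
        row = []
        for f in bin_hz:
            if left < f <= centre:
                weight = (f - left) / (centre - left)    # rising slope
            elif centre < f < right:
                weight = (right - f) / (right - centre)  # falling slope
            else:
                weight = 0.0
            row.append(weight)
        filters.append(row)
    return filters


def mel_frame(power_spectrum, filters):
    """Apply the filters to one power-spectrum frame and take the log energy."""
    energies = [
        sum(w * p for w, p in zip(row, power_spectrum)) for row in filters
    ]
    # Floor the energies to avoid log(0) for empty or silent bands.
    return [math.log(max(e, 1e-10)) for e in energies]


# Hypothetical sizes for illustration only.
filters = mel_filterbank(n_mels=8, n_fft=64, sample_rate=16000)
frame = mel_frame([1.0] * 33, filters)  # a flat power spectrum with 33 bins
```

In a Tacotron-style system, a sequence of such log-mel frames serves as the training target that the network learns to predict from the input symbol sequence.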
Pages: 6