End-to-end Speech Synthesis for Tibetan Lhasa Dialect

Cited by: 2

Authors:

Luo, Lisai [1]

Li, Guanyu [1]

Gong, Chunwei [1]

Ding, Hailan [1]

Affiliations:

[1] Northwest Minzu Univ, Key Lab Natl Language Intelligent Proc Gansu Prov, Lanzhou, Gansu, Peoples R China

Source:

2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018) | 2019 / Vol. 1187

Keywords:

DOI:

10.1088/1742-6596/1187/5/052061

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Speech synthesis for the Tibetan Lhasa dialect is implemented on the basis of Tacotron, a novel end-to-end speech synthesis framework. The training transcripts use the phoneme list transcribed from Tibetan characters, and feature parameters are extracted from the mel-spectrogram. The model is then trained on the mapping from characters to spectrum. Tibetan is an important minority language of the Chinese nation, but there is little research on it at present. The experimental results were compared with traditional speech synthesis methods; the audio quality is significantly better than that of the traditional GMM-HMM approach in both naturalness and rhythm. This work provides a crucial reference for later research on Tibetan and promotes the development of Tibetan language research.


Pages: 6

