The paradigm for creating multi-lingual text-to-speech voice databases

Cited by: 0

Authors:

Chu, Min [1]

Zhao, Yong [1]

Chen, Yining [1]

Wang, Lijuan [1]

Soong, Frank [1]

Affiliation:

[1] Microsoft Research Asia, Beijing, People's Republic of China

Source:

CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS | 2006, Vol. 4274

Keywords:

multi-lingual; text-to-speech; voice database;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A voice database is one of the most important parts of a TTS system. However, creating a high-quality new TTS voice is not an easy task, even for a professional team. The whole process is rather complicated and contains many details that must be handled carefully. In fact, many stages require human intervention, such as manual checking or labeling. In multi-lingual situations, it is even more challenging to find qualified people to do this work. That is why most state-of-the-art TTS systems can provide only a few voices. In this paper, we outline a uniform paradigm for creating multi-lingual TTS voice databases. It focuses on technologies that can either improve the scalability of data collection or reduce human intervention such as manual checking or labeling. With this paradigm, we decrease the complexity and workload of the task.

Pages: 736 / +

Page count: 3

50 entries in total

[1] Lightweight, Multi-Speaker, Multi-Lingual Indic Text-to-Speech
Singh, Abhayjeet
Nagireddi, Amala
Jayakumar, Anjali
Deekshitha, G.
Bandekar, Jesuraja
Roopa, R.
Badiger, Sandhya
Udupa, Sathvik
Kumar, Saurabh
Ghosh, Prasanta Kumar
Murthy, Hema A.
Zen, Heiga
Kumar, Pranaw
Kant, Kamal
Bole, Amol
Singh, Bira Chandra
Tokuda, Keiichi
Hasegawa-Johnson, Mark
Olbrich, Philipp
IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2024, 5 : 790 - 798
[2] Multi-Lingual Multi-Speaker Text-to-Speech Synthesis for Voice Cloning with Online Speaker Enrollment
Liu, Zhaoyu
Mak, Brian
INTERSPEECH 2020, 2020 : 2932 - 2936
[3] Development of multi-lingual speech recognition and text-to-speech synthesis for automotive applications
Deguchi, Y
Kagoshima, T
Hirabayashi, G
Kanazawa, H
TELEMATICS FOR VEHICLES, 2002, 1728 : 233 - 240
[4] Development of multi-lingual speech recognition and text-to-speech synthesis for automotive applications
Deguchi, Y.
Kagoshima, T.
Hirabayashi, G.
Kanazawa, H.
VDI Berichte, 2002, (1728) : 233 - 240
[5] LIGHT-TTS: LIGHTWEIGHT MULTI-SPEAKER MULTI-LINGUAL TEXT-TO-SPEECH
Li, Song
Ouyang, Beibei
Li, Lin
Hong, Qingyang
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021 : 8383 - 8387
[6] A Controllable Multi-Lingual Multi-Speaker Multi-Style Text-to-Speech Synthesis With Multivariate Information Minimization
Cheon, Sung Jun
Choi, Byoung Jin
Kim, Minchan
Lee, Hyeonseung
Kim, Nam Soo
IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 55 - 59
[7] Transfer Learning for Low-Resource, Multi-Lingual, and Zero-Shot Multi-Speaker Text-to-Speech
Jeong, Myeonghun
Kim, Minchan
Choi, Byoung Jin
Yoon, Jaesam
Jang, Won
Kim, Nam Soo
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 1519 - 1530
[8] Development of multi-lingual speech recognition and text-to speech synthesis for automotive applications
Deguchi, Y.
Kagoshima, T.
Hirabayashi, G.
Kanazawa, H.
Hogenhout, M.
VDI Berichte, 2003, (1789) : 3081 - 3088
[9] Development of multi-lingual speech recognition and text-to speech synthesis for automotive applications
Deguchi, Y
Kagoshima, T
Hirabayashi, G
Kanazawa, H
Hogenhout, M
ELECTRONIC SYSTEMS FOR VEHICLES, 2003, 1789 : 1167 - 1174
[10] Multi-lingual interoperability in speech technology
Steeneken, HJM
SPEECH COMMUNICATION, 2001, 35 (1-2) : 1 - 3