The paradigm for creating multi-lingual text-to-speech voice databases

Authors
Chu, Min [1]
Zhao, Yong [1]
Chen, Yining [1]
Wang, Lijuan [1]
Soong, Frank [1]
Affiliations
[1] Microsoft Res Asia, Beijing, Peoples R China
Keywords
multi-lingual; text-to-speech; voice database
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A voice database is one of the most important parts of a TTS system. However, creating a new high-quality TTS voice is not an easy task, even for a professional team. The whole process is rather complicated and involves many details that must be handled carefully. In fact, many stages require human intervention, such as manual checking or labeling. In multi-lingual situations, it is even more challenging to find qualified people to do this work. That is why most state-of-the-art TTS systems can provide only a few voices. In this paper, we outline a uniform paradigm for creating multi-lingual TTS voice databases. It focuses on technologies that can either improve the scalability of data collection or reduce human intervention such as manual checking or labeling. With this paradigm, we decrease the complexity and workload of the task.
Pages: 736 / +
Page count: 3