Hybrid model method for automatic segmentation of Mandarin TTS corpus

Cited by: 0
Authors
Yuan, Xiaoliang [1]
Dong, Yuan
Huang, Dezhi
Guo, Jun
Wang, Haila
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Informat Engn, Beijing 100876, Peoples R China
[2] France Telecom, R&D Beijing Co Ltd, Beijing, Peoples R China
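Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract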
For a corpus-based Mandarin text-to-speech system, the quality of synthesized speech depends strongly on the accuracy of unit boundaries. In this paper, we propose a hybrid model method for automatic segmentation of a Mandarin text-to-speech corpus. The boundaries of acoustic units are categorized into eleven phonetic groups. For each phonetic group of boundaries, the proposed method selects an appropriate model from among an initial-final monophone-based HMM, a semi-syllable monophone-based HMM, and an initial-final triphone-based HMM. The experimental results show that the hybrid model method achieves better performance than the single model method in terms of both boundary error rate and boundary time shift.
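The per-group model selection described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the "appropriate" model is chosen, so the sketch assumes it is the model with the lowest boundary error rate on a manually labeled development set. The model identifiers, the group names, the 20 ms tolerance, and the functions `select_models` and `hybrid_boundary` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical identifiers for the three model types named in the abstract.
MODELS = ("if_monophone", "semisyllable_monophone", "if_triphone")


def select_models(dev_boundaries, tolerance=0.020):
    """Pick, per phonetic group, the model with the lowest boundary error rate.

    dev_boundaries: iterable of (group, reference_time, {model: predicted_time}),
        with times in seconds. Every model is assumed to predict every boundary.
    tolerance: assumed error threshold in seconds (not given in the abstract).
    """
    errors = defaultdict(lambda: defaultdict(int))
    counts = defaultdict(int)
    for group, ref, preds in dev_boundaries:
        counts[group] += 1
        for model, predicted in preds.items():
            # A boundary counts as an error when it misses the reference
            # by more than the tolerance.
            if abs(predicted - ref) > tolerance:
                errors[group][model] += 1
    # Ties are broken by the order of MODELS.
    return {
        group: min(MODELS, key=lambda m: errors[group][m] / counts[group])
        for group in counts
    }


def hybrid_boundary(group, preds, selection, default="if_monophone"):
    """Return the boundary time from the model selected for this group."""
    return preds[selection.get(group, default)]
```

With a selection table built once from development data, each boundary in the corpus is then taken from whichever model was chosen for its phonetic group, falling back to an assumed default for groups not seen during selection.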
Pages: 906-912
Number of pages: 7