An efficient Mandarin text-to-speech system on time domain

Cited by: 0
Authors
Lin, YJ [1 ]
Yu, MS [1 ]
Affiliation
[1] Natl Chung Hsing Univ, Dept Appl Math, Taichung 40227, Taiwan
Source
Keywords
text-to-speech; intelligibility; comprehensibility; naturalness;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
This paper describes a complete time-domain Mandarin text-to-speech system. We take advantage of advances in memory technology, which offer ever-increasing capacity at ever-lower prices, to collect as many synthesis units as possible for a Mandarin text-to-speech system. With this effort, we developed simpler speech processing techniques and achieved faster processing speed using only an ordinary personal computer. We also developed careful methods to measure the intelligibility, comprehensibility, and naturalness of a Mandarin text-to-speech system. Our system performs very well compared with existing systems. We first develop a set of algorithms and methods to handle features of the syllables such as duration, amplitude, fundamental frequency, and pause. Based on these algorithms and methods, we then build a Mandarin text-to-speech system. Given any Chinese text in a computerized form, e.g., in BIG-5 code representation, our system can pronounce the text in real time. Our text-to-speech system runs on an IBM-compatible 80486 PC, with no special hardware for signal processing. The evaluation of the system is based on a proposed subjective evaluation method. An evaluation was made by 51 undergraduate students. The intelligibility of the system is 99.5%, its comprehensibility is 92.6%, and its naturalness is 81.512 points on a 100-point grading scale (the highest score is 100 points and the lowest is 0 points). Another 40 Ph.D. students performed the same naturalness evaluation. The result shows a naturalness of 82.8 points on the same scale.
Pages: 545-555
Page count: 11