A Mandarin text-to-speech technique implemented on a PIC-based microcontroller platform

Authors
Yeh, Cheng-Yu [1]
Chang, Chih-Hsuan [1]
Affiliations
[1] Natl Chin Yi Univ Technol, Dept Elect Engn, Taichung 41170, Taiwan
Keywords
text-to-speech; real-time embedded system; microcontroller; recurrent neural network; pitch-synchronous overlap-add
DOI
10.1002/tee.22327
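Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract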
This paper applies a Mandarin text-to-speech (TTS) technique to implement a voiced E-book on a PIC-based embedded platform. Converting E-book text into the corresponding speech can help blind users and makes reading more effortless and relaxed. The embedded platform combines a microcontroller on a PIC32 Ethernet Starter Kit (80 MHz, 32-bit, 128 kB SRAM, 512 kB Flash) with the Multimedia Expansion Board designed by Microchip Technology Inc. The Mandarin TTS system comprises four subsystems, all implemented in C: text analysis, a recurrent neural network-based prosodic generator, a synthesis unit generator with 411 Chinese syllabic waveforms, and a pitch-synchronous overlap-add-based speech synthesizer. Experimental results show that 1.66 MB of storage memory, less than 25.4 kB of runtime memory, and 21.3% CPU runtime are sufficient for real-time operation, producing natural and fluent speech as 16-bit PCM at an 8 kHz sampling rate. The performance of the PIC-based Mandarin TTS system is demonstrated to be good. © 2016 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
Pages: S60-S64
Number of pages: 5
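
The abstract names a pitch-synchronous overlap-add (PSOLA) synthesizer but gives no implementation detail. As a rough illustration only, and not the authors' code, the C sketch below shows the core overlap-add idea on 16-bit PCM: short windowed segments centered on pitch marks are re-spaced at a new period and summed. It assumes a constant source pitch period with marks at multiples of `src_period`, uses an integer triangular window in place of the more common Hann window to avoid floating point, and normalizes by the summed window weights. The function name `ola_pitch_shift` and its parameters are hypothetical.

```c
#include <stdint.h>

/*
 * Simplified overlap-add pitch change (illustrative sketch, hypothetical API).
 *
 * in, out     : 16-bit PCM buffers of n samples each (must not alias)
 * src_period  : pitch period of the input in samples; pitch marks are
 *               ASSUMED to sit at 0, src_period, 2*src_period, ...
 * dst_period  : desired pitch period of the output in samples
 *
 * Each output sample is a weighted average of the source segments whose
 * triangular windows (half-width src_period) cover it.
 */
void ola_pitch_shift(const int16_t *in, int16_t *out, int n,
                     int src_period, int dst_period)
{
    if (n <= 0 || src_period <= 0 || dst_period <= 0)
        return;

    for (int t = 0; t < n; t++) {
        int64_t num = 0; /* sum of sample * window weight */
        int64_t den = 0; /* sum of window weights         */

        /* Earliest synthesis mark that could still reach sample t. */
        int j = (t - src_period) / dst_period;
        if (j < 0)
            j = 0;

        for (;; j++) {
            int m = j * dst_period; /* synthesis mark position    */
            int k = t - m;          /* offset of t from that mark */

            if (k <= -src_period)
                break;              /* this and later marks are too far right */
            if (k >= src_period)
                continue;           /* mark is too far left */

            /* Nearest source pitch mark to the synthesis mark. */
            int s = ((m + src_period / 2) / src_period) * src_period;
            int idx = s + k;
            if (idx < 0 || idx >= n)
                continue;           /* segment runs off the buffer */

            int w = src_period - (k < 0 ? -k : k); /* triangular weight >= 1 */
            num += (int64_t)in[idx] * w;
            den += w;
        }

        /* A weighted average of int16 values always fits in int16. */
        out[t] = (den > 0) ? (int16_t)(num / den) : 0;
    }
}
```

With `dst_period` equal to `src_period` the routine reproduces its input exactly, which is a convenient sanity check. At the 8 kHz sampling rate quoted in the abstract, a period of 80 samples corresponds to a 100 Hz pitch.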