A mandarin text-to-speech technique implemented on a PIC-based microcontroller platform

Cited by: 0

Authors:

Yeh, Cheng-Yu [1]

Chang, Chih-Hsuan [1]

Affiliation:

[1] Natl Chin Yi Univ Technol, Dept Elect Engn, Taichung 41170, Taiwan

Source:

IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING | 2016 / Vol. 11

Keywords:

text-to-speech; real-time embedded system; microcontroller; recurrent neural network; pitch-synchronous overlap-add;

DOI:

10.1002/tee.22327

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Subject classification code:

0808 ; 0809 ;

Abstract:

In this paper, a Mandarin text-to-speech (TTS) technique is used to implement a voiced E-book on a PIC-based embedded platform. Converting E-book text to the corresponding speech can help blind users and makes reading more effortless and relaxed. The embedded platform consists of a PIC32 Ethernet Starter Kit (80 MHz, 32-bit, 128 kB SRAM, 512 kB Flash) and the Multimedia Expansion Board designed by Microchip Technology Inc. The Mandarin TTS system comprises four subsystems, namely text analysis, a recurrent neural network-based prosodic generator, a synthesis unit generator with 411 Chinese syllabic waveforms, and a pitch-synchronous overlap-add-based speech synthesizer, all implemented in the C programming language. Experimental results show that 1.66 MB of storage memory, less than 25.4 kB of runtime memory, and 21.3% CPU runtime are sufficient for real-time operation, providing natural and fluent speech as 16-bit PCM at an 8 kHz sampling rate. The performance of the PIC-based Mandarin TTS system is demonstrated to be good. © 2016 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
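The real-time figures in the abstract can be put in perspective with a little arithmetic. The following is a minimal sketch, not code from the paper: the function names are hypothetical, the output is assumed to be a single (mono) channel, and the 21.3% figure is assumed to mean the fraction of the 80 MHz clock spent on synthesis.

```c
#include <assert.h>
#include <stdint.h>

/* Output data rate in bytes per second for mono PCM (mono is an assumption). */
uint32_t pcm_bytes_per_second(uint32_t sample_rate_hz, uint32_t bits_per_sample)
{
    assert(bits_per_sample % 8u == 0u);   /* whole bytes per sample only */
    return sample_rate_hz * (bits_per_sample / 8u);
}

/* CPU clock cycles that elapse between two consecutive output samples. */
uint32_t cycles_per_sample(uint32_t cpu_hz, uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0u);
    return cpu_hz / sample_rate_hz;
}

/* Cycles per sample consumed at a given CPU load.
 * load_permille is the load in tenths of a percent, e.g. 213 for 21.3%. */
uint32_t cycles_used_per_sample(uint32_t cpu_hz, uint32_t sample_rate_hz,
                                uint32_t load_permille)
{
    uint64_t budget = cycles_per_sample(cpu_hz, sample_rate_hz);
    return (uint32_t)((budget * load_permille) / 1000u);
}
```

Under these assumptions, 16-bit PCM at 8 kHz is 16,000 bytes per second, an 80 MHz core has 10,000 cycles available per output sample, and a 21.3% load corresponds to roughly 2,130 of those cycles.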

Pages: S60 - S64

Page count: 5

50 entries in total

[11] AN RNN-BASED SPECTRAL INFORMATION GENERATION FOR MANDARIN TEXT-TO-SPEECH
EOOO/CCL, Industrial Technology Research Institute, Chutung, Hsinchu, Taiwan
Unknown
Eur. Conf. Speech Commun. Technol., EUROSPEECH, 1600, (549-552):
[12] A consistency analysis on an acoustic module for Mandarin text-to-speech
Yeh, Cheng-Yu
Chang, Shun-Chieh
Hwang, Shaw-Hwa
SPEECH COMMUNICATION, 2013, 55 (02) : 266 - 277
[13] Refining Unit Boundaries for Mandarin Text-to-Speech Database
Dong, Minghui
Cen, Ling
Chan, Paul
Li, Haizhou
2009 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, 2009, : 245 - 248
[14] The pause duration prediction for mandarin text-to-speech system
Yu, J
Tao, JH
Proceedings of the 2005 IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE'05), 2005, : 204 - 208
[15] An efficient Mandarin text-to-speech system on time domain
Lin, YJ
Yu, MS
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1998, E81D (06) : 545 - 555
[16] An enhanced text analysis approach in text-to-speech synthesis for mandarin chinese
Jiang, Wei
Wang, Xiao-Long
Guan, Yi
Pang, Xiu-Li
ICNC 2007: THIRD INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, VOL 5, PROCEEDINGS, 2007, : 410 - +
[17] Text-To-Speech based dictation platform for students with learning difficulties
Oumaima, Zine
Abdelouafi, Meziane
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS: THEORIES AND APPLICATIONS (SITA'18), 2018,
[18] A set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese
Chou, FC
Tseng, CY
Lee, LS
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2002, 10 (07) : 481 - 494
[19] Integrating coding techniques into LP-based Mandarin text-to-speech synthesis
Hu H.-T.
Wang H.-M.
Int J Speech Technol, 2007, 1 (31-44) : 31 - 44
[20] Hierarchical stress modeling and generation in mandarin for expressive Text-to-Speech
Li, Ya
Tao, Jianhua
Hirose, Keikichi
Xu, Xiaoying
Lai, Wei
SPEECH COMMUNICATION, 2015, 72 : 59 - 73