Text normalization in Mandarin Text-to-Speech system

Cited by: 0

Authors:

Jia, Yuxiang [1,2]

Huang, Dezhi [2]

Liu, Wu [2]

Dong, Yuan [2,3]

Yu, Shiwen [1]

Wang, Haila [2]

Affiliations:

[1] Peking Univ, Inst Computat Linguist, Beijing 100871, Peoples R China

[2] France Telecom R&D Beijing, Speech & Nat Language Proc Unit, Beijing, Peoples R China

[3] Beijing Univ Posts & Telecommun, Beijing, Peoples R China

Source:

2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008

Keywords:

Text-to-Speech (TTS); text normalization; finite state automata; maximum entropy classifier

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Text normalization is an important component of a Text-to-Speech (TTS) system, and its main difficulty lies in disambiguating Non-Standard Words (NSWs). This paper develops a taxonomy of NSWs based on a large-scale Chinese corpus and proposes a two-stage NSW disambiguation strategy: Finite State Automata (FSA) for initial classification, followed by Maximum Entropy (ME) classifiers for subclass disambiguation. Using this taxonomy, the two-stage approach achieves an F-score of 98.53% in an open test, 5.23% higher than that of the FSA-based approach. Experiments show that the NSW taxonomy gives the FSA a high baseline performance, that the ME classifiers yield a considerable improvement, and that the two-stage approach adapts well to new domains.
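
The two-stage strategy named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: this record does not give the paper's taxonomy, automata, or feature set, so the coarse classes (`PERCENT`, `DATE`, `DIGITS`), the subclass labels (`CARDINAL`, `DIGIT_BY_DIGIT`), the context features, and the toy training examples below are all assumptions made for illustration. Regular expressions stand in for the finite state automata, and a small multinomial logistic regression stands in for the maximum entropy classifier.

```python
import math
import re
from collections import defaultdict

# Stage 1: regular expressions stand in for finite state automata.
# These coarse classes are illustrative, not the paper's taxonomy.
COARSE_PATTERNS = [
    ("PERCENT", re.compile(r"\d+(\.\d+)?%")),
    ("DATE", re.compile(r"\d{4}-\d{1,2}-\d{1,2}")),
    ("DIGITS", re.compile(r"\d+")),
]


def coarse_class(token):
    """Return the first coarse class whose pattern matches the whole token."""
    for name, pattern in COARSE_PATTERNS:
        if pattern.fullmatch(token):
            return name
    return "OTHER"


def features(token, prev_word, next_word):
    """Hypothetical context features for subclass disambiguation."""
    return [f"prev={prev_word}", f"next={next_word}", f"len={min(len(token), 8)}"]


class MaxEnt:
    """Minimal multinomial logistic regression (maximum entropy) classifier."""

    def __init__(self, labels):
        self.labels = labels
        self.weights = defaultdict(float)  # (label, feature) -> weight

    def probs(self, feats):
        scores = [sum(self.weights[(y, f)] for f in feats) for y in self.labels]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def train(self, data, epochs=200, lr=0.5):
        # Stochastic gradient ascent on the conditional log-likelihood.
        for _ in range(epochs):
            for feats, gold in data:
                p = self.probs(feats)
                for y, p_y in zip(self.labels, p):
                    grad = (1.0 if y == gold else 0.0) - p_y
                    for f in feats:
                        self.weights[(y, f)] += lr * grad

    def predict(self, feats):
        p = self.probs(feats)
        return self.labels[p.index(max(p))]


# Toy training data (invented): digit strings with their left and right context.
TRAIN = [
    (features("62751234", "电话", "。"), "DIGIT_BY_DIGIT"),
    (features("13800001111", "拨打", "。"), "DIGIT_BY_DIGIT"),
    (features("350", "共", "元"), "CARDINAL"),
    (features("1200", "有", "人"), "CARDINAL"),
]
model = MaxEnt(["CARDINAL", "DIGIT_BY_DIGIT"])
model.train(TRAIN)


def classify(token, prev_word, next_word):
    """Stage 1 assigns a coarse class; stage 2 refines ambiguous digit strings."""
    coarse = coarse_class(token)
    if coarse != "DIGITS":
        return coarse
    return model.predict(features(token, prev_word, next_word))
```

In this sketch only the `DIGITS` class is passed to the second stage, reflecting the idea that the pattern-based stage resolves unambiguous forms cheaply and the statistical classifier is reserved for tokens whose reading depends on context.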


Pages: 4693 / +

Page count: 2
