An embedded English synthesis approach based on speech concatenation and smoothing

Cited by: 0

Authors:

Chen, GL [1]

Yue, DJ [1]

Zu, YQ [1]

Yu, ZL [1]

Affiliation:

[1] Motorola Labs, China Res Ctr, Shanghai, Peoples R China

Source:

2004 International Symposium on Chinese Spoken Language Processing, Proceedings | 2004

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

An embedded English synthesis approach based on speech concatenation and smoothing is described. The approach adopts phonetic sub-words as the carrier of variable-length units, and five classes of units are defined to cover all English phonetic phenomena. The corresponding cost function and a search procedure based on dynamic programming are addressed in the unit-selection stage. In the synthesis stage, vocal tract response, pitch value and phase are interpolated and merged at the concatenation points to smooth the speech. A preliminary test shows that the approach achieves a good balance among naturalness, intelligibility and data footprint.
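
The abstract does not give the cost function or the search details, so the following is only a generic sketch of how a dynamic-programming unit search is commonly organized, not the authors' method. The function name `select_units`, the `target_cost` and `join_cost` callbacks, and the representation of a unit are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

U = TypeVar("U")  # a candidate unit; its representation is hypothetical


def select_units(
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[int, U], float],
    join_cost: Callable[[U, U], float],
) -> Tuple[List[U], float]:
    """Return the minimum-cost unit sequence and its total cost.

    candidates[i] lists the units available for target position i.
    Both cost functions are assumed inputs, not taken from the paper.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every target position needs at least one candidate")

    # best[j]: lowest cost of any path ending in candidate j of the current position
    best = [target_cost(0, u) for u in candidates[0]]
    back: List[List[int]] = []  # back[i-1][j]: best predecessor index for position i

    for i in range(1, len(candidates)):
        prev_units = candidates[i - 1]
        new_best: List[float] = []
        pointers: List[int] = []
        for u in candidates[i]:
            # Pick the predecessor that minimizes accumulated cost plus join cost.
            k = min(
                range(len(prev_units)),
                key=lambda p: best[p] + join_cost(prev_units[p], u),
            )
            new_best.append(best[k] + join_cost(prev_units[k], u) + target_cost(i, u))
            pointers.append(k)
        best = new_best
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j] for i, j in enumerate(path)], total
```

In this sketch the join cost could, for example, penalize a pitch mismatch at the boundary between two units, which is the kind of discontinuity the smoothing stage would then have to reduce; the search visits each pair of adjacent candidates once, so its cost grows linearly with the number of target positions.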


Pages: 157-160

Page count: 4

