FarsBayan: A Unit Selection based Farsi Speech Synthesizer

Cited by: 0

Authors:

Homayounpour, M. Mehdi [1]

Namnabat, Majid [1]

Affiliations:

[1] Amirkabir Univ Technol Tehran Polytech, Comp Engn & Informat Technol Dept, Lab Intelligent Sound & Speech Proc, Tehran, Iran

Source:

INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 | 2006

Keywords:

speech synthesis; unit selection; Farsi language; pre-selection using adaptive threshold;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In recent years, unit selection-based concatenative speech synthesis using a large corpus has attracted great attention. This method produces more natural-sounding speech than parameter-driven methods. Formant synthesis, the HNM method, and the MLSA filter are the prevalent methods for synthesizing Farsi speech. In this paper, we present the structure of a proposed unit selection synthesizer for the Farsi language. In the proposed system, linear regression is used to determine the weights of the discrete sub-costs in the target cost, while the weights of the other sub-costs are held constant. We also present a pre-selection algorithm that uses an adaptive threshold to prune the candidate units. In addition, we study whether the TD-PSOLA algorithm improves the quality of the resulting speech; informal tests show that it degrades the output quality. The output speech was found to be remarkably fluent and natural. Its quality was evaluated with a subjective mean opinion score (MOS) test, which gave a MOS of 3.8 for overall quality.
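The abstract does not spell out the pre-selection rule or the search procedure, so the following is only an illustrative sketch of how adaptive-threshold pruning could feed a standard unit selection search. It rests on assumptions that are not taken from the paper: the threshold is set per target position as the mean candidate cost minus `alpha` standard deviations, at least `min_keep` candidates always survive, and the final path is found with a Viterbi search over target and join costs. The names `preselect`, `select_units`, `alpha`, `min_keep`, and `join_cost` are hypothetical.

```python
from statistics import mean, pstdev


def preselect(costs, alpha=0.5, min_keep=2):
    """Prune candidates for one target position (assumed rule, not the paper's).

    Keeps candidates whose target cost is at most mean - alpha * std of the
    costs at this position, so the threshold adapts to each position.
    Falls back to the min_keep cheapest candidates if too few survive.
    """
    threshold = mean(costs) - alpha * pstdev(costs)
    order = sorted(range(len(costs)), key=costs.__getitem__)
    kept = [i for i in order if costs[i] <= threshold]
    if len(kept) < min_keep:
        kept = order[:min_keep]
    return kept


def select_units(target_costs, join_cost, alpha=0.5, min_keep=2):
    """Viterbi search over the pruned candidates.

    target_costs: non-empty list (one entry per target position) of lists of
        candidate target costs.
    join_cost(t, i, j): cost of joining candidate i at position t - 1 to
        candidate j at position t.
    Returns (path, total_cost), where path holds one candidate index per position.
    """
    pruned = [preselect(c, alpha, min_keep) for c in target_costs]
    best = {i: target_costs[0][i] for i in pruned[0]}
    back = []
    for t in range(1, len(target_costs)):
        new_best, pointers = {}, {}
        for j in pruned[t]:
            # Cheapest predecessor for candidate j at position t.
            prev = min(best, key=lambda i: best[i] + join_cost(t, i, j))
            new_best[j] = best[prev] + join_cost(t, prev, j) + target_costs[t][j]
            pointers[j] = prev
        best = new_best
        back.append(pointers)
    last = min(best, key=best.get)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]
```

Pruning before the search keeps the Viterbi step proportional to the number of surviving candidates per position rather than the full inventory, which is the usual motivation for pre-selection in large-corpus systems.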

Pages: 1336-1339

Page count: 4

