A Flexible Architecture for Urdu Phonemes-Based Concatenative Speech Synthesis

Times cited: 0
Authors
Ahmad, Muhammad Rizwan [1 ]
Arshad, Muhammad Junaid [1 ]
Affiliations
[1] Univ Engn & Technol, Dept Comp Sci, Lahore, Pakistan
Keywords
Articulatory; Text-to-Speech; Formant; Concatenative; Natural Language Processing; Waveforms; Speech Units; Phonemes; Speech Synthesis;
DOI
10.22581/muet1982.1603.07
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Text-to-speech (TTS) synthesis systems are used extensively around the world to improve access to information and to let people with disabilities interact directly with computers and benefit from modern technology. Various TTS synthesis techniques have been used, each with its own advantages and limitations. There is currently no architecture for an Urdu TTS synthesis system that is based on a concatenative synthesis strategy, handles homographs, and avoids the unnatural, robotic-sounding speech produced by the use of diphones. In this paper, we propose a flexible architecture for an Urdu TTS synthesis system that uses a concatenative synthesis strategy, because this approach can join units from a small speech corpus to generate natural and intelligible sound. The main aim of this research is to disambiguate homographs in the Urdu language and to avoid unnatural, robotic-sounding speech. Finally, the effectiveness of the system is tested in terms of intelligibility and acceptability at the word and sentence levels. The intelligibility rate is close to 80% and 65%, while the acceptability rate for naturalness is 95% (75% natural, 20% acceptable).
Pages: 373-380
Page count: 8