A Flexible Architecture for Urdu Phonemes-Based Concatenative Speech Synthesis

Cited by: 0
Authors
Ahmad, Muhammad Rizwan [1 ]
Arshad, Muhammad Junaid [1 ]
Affiliations
[1] Univ Engn & Technol, Dept Comp Sci, Lahore, Pakistan
Keywords
Articulatory; Text-to-Speech; Formant; Concatenative; Natural Language Processing; Waveforms; Speech Units; Phonemes; Speech Synthesis
DOI
10.22581/muet1982.1603.07
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
TTS (Text-to-Speech) synthesis systems are used extensively across the world to improve the accessibility of information and to let people with disabilities interact directly with computers and benefit from this technological revolution. Various TTS synthesis techniques have been used, each with its own advantages and limitations. No existing architecture for an Urdu TTS synthesis system uses a concatenative synthesis strategy to handle homographs and to avoid the unnatural, robotic-sounding speech produced by the use of di-phones. In this paper, we propose a flexible architecture for an Urdu TTS synthesis system that uses a concatenative synthesis strategy, because this approach can join together the units of a small speech corpus to generate natural and intelligible sound. The main aim of this research is to disambiguate homographs in the Urdu language and to avoid unnatural, robotic-sounding speech. Finally, the effectiveness of the system is tested in terms of intelligibility and acceptability at the word and sentence levels. The intelligibility rate is close to 80% and 65%, while the acceptability rate for naturalness is 95% (75% natural, 20% acceptable).
Pages: 373-380
Number of pages: 8