A quantitative method for modeling context in concatenative synthesis using large speech database

Cited by: 0
Authors
Hamza, W
Rashwan, M
Afify, M
Source
2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURAL NETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM | 2001
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Modeling phonetic context is key to natural-sounding output in concatenative speech synthesis. This paper proposes a new quantitative method for modeling context. In the proposed method, context is measured as the distance between leaves of the top-down, likelihood-based decision trees grown during construction of the acoustic inventory. Unlike other context modeling methods, this method allows the unit selection algorithm to borrow unit occurrences from other contexts when the context distance is small. This is done by incorporating the measured distance as an element of the unit selection cost function. The motivation is that using better unit occurrences from nearby contexts reduces the amount of speech modification required. The method also makes it easy to use long synthesis units, e.g. syllables or words, within the same unit selection framework.
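The idea of treating context distance as one term of the selection cost can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: leaves are encoded as strings of yes/no answers from the tree root, the distance is a simple tree-path edge count (the abstract does not say which leaf distance the authors use), and `prosody_cost`, the weights, and the unit names are hypothetical. Concatenation costs and the search over a whole utterance are left out.

```python
from dataclasses import dataclass


def leaf_distance(path_a: str, path_b: str) -> int:
    """Edge count between two leaves via their lowest common ancestor.

    Assumed metric: each leaf is named by its root-to-leaf answers ("Y"/"N").
    """
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)


@dataclass
class Candidate:
    unit_id: str         # hypothetical identifier of a unit occurrence
    leaf_path: str       # leaf in which this occurrence was clustered
    prosody_cost: float  # stand-in for all other target sub-costs


def target_cost(target_leaf: str, cand: Candidate,
                w_context: float = 1.0, w_prosody: float = 1.0) -> float:
    """Weighted sum in which context distance is one element of the cost."""
    return (w_context * leaf_distance(target_leaf, cand.leaf_path)
            + w_prosody * cand.prosody_cost)


def select_unit(target_leaf: str, candidates: list[Candidate]) -> Candidate:
    """Pick the cheapest candidate, possibly from a neighbouring context."""
    return min(candidates, key=lambda c: target_cost(target_leaf, c))


candidates = [
    Candidate("a_017", "YYN", 3.0),  # exact context, poor prosodic match
    Candidate("a_042", "YYY", 0.5),  # sibling leaf, good prosodic match
    Candidate("a_108", "NN", 0.2),   # distant context
]
best = select_unit("YYN", candidates)
print(best.unit_id)
```

With these made-up numbers the occurrence from the sibling leaf is chosen over the exact-context one, because its small context distance is outweighed by the better prosodic match. That is the "borrowing" behaviour the abstract describes, shown here only under the stated assumptions.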
Pages: 789-792
Page count: 4
Related papers
50 in total
  • [21] Speech Synthesis Using Compressed Database
    Rybarova, R.
    Rozinaj, G.
    PROCEEDINGS OF ELMAR-2015 57TH INTERNATIONAL SYMPOSIUM ELMAR-2015, 2015, : 105 - 108
  • [22] Marathi Language Speech Synthesizer Using Concatenative Synthesis Strategy (Spoken in Maharashtra, India)
    Shirbahadurkar, S. D.
    Bormane, D. S.
    2009 SECOND INTERNATIONAL CONFERENCE ON MACHINE VISION, PROCEEDINGS, ( ICMV 2009), 2009, : 181 - +
  • [23] Context-dependent acoustic modeling using graphemes for large vocabulary speech recognition
    Kanthak, S
    Ney, H
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 845 - 848
  • [24] Analysis of Vietnamese Tones to Optimize Database in Speech Synthesis Using Unit Selection Method
    Vu Due Lung
    Nguyen Phuoe Loe
    Cao Van Hung
    Nguyen Viet Quoe
    2012 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT), 2012, : 43 - 48
  • [25] Quantitative intonation modeling of interrogative sentences for Mandarin speech synthesis
    Li, Ya
    Tao, Jianhua
    Lai, Wei
    Xu, Xiaoying
    SPEECH COMMUNICATION, 2017, 89 : 92 - 102
  • [26] UNSUPERVISED PROSODIC PHRASE BOUNDARY LABELING OF MANDARIN SPEECH SYNTHESIS DATABASE USING CONTEXT-DEPENDENT HMM
    Yang, Chen-Yu
    Ling, Zhen-Hua
    Dai, Li-Rong
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 6875 - 6879
  • [27] Speech bandwidth extension method using speech recognition and speech synthesis
    Takashina, Masashi
    Kuroiwa, Shingo
    Tsuge, Satoru
    Ren, Fuji
    2006 10TH INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY, VOLS 1 AND 2, PROCEEDINGS, 2006, : 1273 - +
  • [28] Implementation of Speech Synthesis based on HMM using PADAS database
    Khalil, Krichi Mohamed
    Adnan, Cherif
    2015 IEEE 12TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS & DEVICES (SSD), 2015,
  • [29] A statistical method for database reduction for embedded unit selection speech synthesis
    Tsiakoulis, Pirros
    Chalamandaris, Aimilios
    Karabetsos, Sotiris
    Raptis, Spyros
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4601 - 4604
  • [30] Automatic Phrase Boundary Labeling of Speech Synthesis Database Using Context-Dependent HMMs and N-Gram Prior Distributions
    Chen, Qian
    Ling, Zhen-Hua
    Yang, Chen-Yu
    Dai, Li-Rong
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1581 - 1585