Evaluation of speech unit modelling for HMM-based speech synthesis for Arabic

Cited by: 2
Authors
Houidhek, Amal [1 ,2 ]
Colotte, Vincent [2 ]
Mnasri, Zied [1 ]
Jouvet, Denis [2 ]
Affiliations
[1] Univ Tunis El Manar, Ecole Natl Ingenieurs Tunis, Elect Engn Dept, Tunis, Tunisia
[2] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France
Keywords
Parametric speech synthesis; Statistical modelling; Arabic language; Speech unit modelling; Vowel quantity; Gemination;
DOI
10.1007/s10772-018-09558-6
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Subject classification codes
0808 ; 0809 ;
Abstract
This paper investigates the use of hidden Markov models (HMMs) for Modern Standard Arabic speech synthesis. HMM-based speech synthesis systems require each speech unit to be described by a set of contextual features that specify its phonetic, phonological and linguistic aspects. To apply this method to the Arabic language, a study of its particularities was conducted to extract suitable contextual features. Two phenomena are highlighted: vowel quantity and gemination. This work focuses on how to model geminated consonants (resp. long vowels): either as fully-fledged phonemes, or as the same phonemes as their simple (resp. short) counterparts but with a different duration. Four modelling approaches are proposed for this purpose. Subjective and objective evaluations show no important difference between using separate modelling units for geminated consonants (resp. long vowels) and simple consonants (resp. short vowels) and merging them, as long as gemination and vowel quantity information is included in the set of features.
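The two modelling options contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Segment` class, the `:` length suffix, and the feature names `geminated` and `long_vowel` are hypothetical, and the abstract does not say how the four evaluated approaches combine these options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    base: str         # base phoneme symbol, e.g. "t" or "a" (hypothetical notation)
    is_vowel: bool
    lengthened: bool  # True for a geminated consonant or a long vowel


def label_distinct(seg: Segment) -> tuple[str, dict]:
    """Option A: a lengthened segment gets its own modelling unit."""
    unit = seg.base + ":" if seg.lengthened else seg.base
    return unit, {}


def label_merged(seg: Segment) -> tuple[str, dict]:
    """Option B: share the unit with the simple/short counterpart and
    carry the length information as a contextual feature instead."""
    key = "long_vowel" if seg.is_vowel else "geminated"
    return seg.base, {key: seg.lengthened}


geminated_t = Segment(base="t", is_vowel=False, lengthened=True)
short_a = Segment(base="a", is_vowel=True, lengthened=False)

print(label_distinct(geminated_t))
print(label_merged(geminated_t))
print(label_merged(short_a))
```

In option B the number of modelling units stays the same as for a language without these phenomena, and the length distinction is left to the contextual features, which is the condition the abstract attaches to its conclusion.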
Pages: 895 - 906
Number of pages: 12
Related papers
50 items in total
  • [31] State duration modeling for HMM-based speech synthesis
    Zen, Heiga
    Masuko, Takashi
    Tokuda, Keiichi
    Yoshimura, Takayoshi
    Kobayashi, Takao
    Kitamura, Tadashi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (03): : 692 - 693
  • [32] Analysis and HMM-based synthesis of hypo and hyperarticulated speech
    Picart, Benjamin
    Drugman, Thomas
    Dutoit, Thierry
    [J]. COMPUTER SPEECH AND LANGUAGE, 2014, 28 (02): : 687 - 707
  • [33] Optimal Number of States in HMM-Based Speech Synthesis
    Hanzlicek, Zdenek
    [J]. TEXT, SPEECH, AND DIALOGUE, TSD 2017, 2017, 10415 : 353 - 361
  • [34] Pitch dependent phone modelling for HMM-based speech recognition
    Singer, H.
    Sagayama, S.
    [J]. Journal of the Acoustical Society of Japan (E) (English translation of Nippon Onkyo Gakkaishi), 1994, 15 (02):
  • [35] Minimum unit selection error training for HMM-based unit selection speech synthesis system
    Ling, Zhen-Hua
    Wang, Ren-Hua
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 3949 - 3952
  • [36] A trainable excitation model for HMM-based speech synthesis
    Maia, R.
    Toda, T.
    Zen, H.
    Nankaku, Y.
    Tokuda, K.
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1125 - +
  • [37] Speaker interpolation for HMM-based speech synthesis system
    [J]. Yoshimura, Takayoshi, 2000, Acoustical Soc Jpn, Tokyo, Japan (21):
  • [38] Contextual Additive Structure for HMM-Based Speech Synthesis
    Takaki, Shinji
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2014, 8 (02) : 229 - 238
  • [39] Parameterization of Vocal Fry in HMM-Based Speech Synthesis
    Silen, Hanna
    Helander, Elina
    Nurminen, Jani
    Gabbouj, Moncef
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1735 - +
  • [40] Outlier Detection and Removal for HMM-Based Speech Synthesis with an Insufficient Speech Database
    Hong, Doo Hwa
    Sung, June Sig
    Oh, Kyung Hwan
    Kim, Nam Soo
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2012, E95D (09) : 2351 - 2354