Phonetic analysis and automatic prediction of vowel duration in Hungarian spontaneous speech

Authors
Beke, Andras [1]
Gosy, Maria [1]
Affiliations
[1] Hungarian Acad Sci Phonet, Res Inst Linguist, Budapest, Hungary
DOI
10.3233/IDT-140198
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A large number of phonetic and phonological studies have analyzed segmental durations, focusing on the factors and interactions that determine them. The results often play an important role in language technology applications, for example in text-to-speech synthesis (TTS) and automatic speech recognition (ASR), and are widely used in infocommunications today. Both the theory behind this topic and its applications in infocommunications increasingly rely on the results of the cognitive sciences. This encourages scientists to discuss these results within the field of cognitive infocommunications (CogInfoCom) as supporting elements of novel kinds of cognitive capabilities of infocommunication systems. Speech sound duration depends on various factors such as phonetic quality, phonological context, phonological position in the word or in the utterance, speech style, etc. The dependence of vowel duration on multiple factors is more complex in languages where vowel length is a distinctive feature, as in Hungarian. The main goal of the present research was to analyze the physical durations of pairs of vowels in spontaneous speech that exhibit a phonological length opposition. In addition, we intended to develop an algorithm for the automatic classification of short and long vowels occurring in spontaneous speech. On the basis of these findings, we intended to predict vowel durations automatically using three different methods. The measured data confirmed our hypothesis that phonologically short and long vowels differ significantly in their physical durations in spontaneous speech. The results of the automatic vowel length classification also supported this finding. The third aspect of our investigation was to use different supervised learning methods to predict vowel duration, based on different feature vectors consisting of characteristic and spectral features. The best result was obtained when the combined features and a feed-forward neural network (FFNN) were used. The correlation between the target and the predicted vowel duration was 0.79, while the RMSE was 25 ms. On the one hand, the results support the complexity of the features affecting vowel duration; on the other hand, they indicate the temporal complexity of segments in spontaneous speech, as has been reported for Lithuanian, Czech, Hindi, Telugu and Korean.
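The two evaluation figures quoted in the abstract, a correlation coefficient and a root-mean-square error (RMSE) between target and predicted durations, can be illustrated with a minimal sketch. The duration values below are hypothetical and are not data from the paper, the function names are illustrative, and Pearson's r is an assumption, since the abstract does not say which correlation coefficient was used.

```python
import math


def rmse(target, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(target) != len(predicted) or not target:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(target, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(target, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(target) != len(predicted) or len(target) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_t = sum(target) / len(target)
    mean_p = sum(predicted) / len(predicted)
    covariance = sum((t - mean_t) * (p - mean_p) for t, p in zip(target, predicted))
    var_t = sum((t - mean_t) ** 2 for t in target)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_t * var_p)


# Hypothetical vowel durations in milliseconds (illustrative only).
target_ms = [60.0, 85.0, 120.0, 95.0, 150.0]
predicted_ms = [70.0, 80.0, 110.0, 105.0, 140.0]

print(f"RMSE: {rmse(target_ms, predicted_ms):.1f} ms")
print(f"Pearson r: {pearson_r(target_ms, predicted_ms):.2f}")
```

RMSE is expressed in the same unit as the durations (ms), so it reads as a typical prediction error, while the correlation is unit-free and only measures how well predicted durations track the targets.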
Pages: 301-314
Number of pages: 14