Phonetic analysis and automatic prediction of vowel duration in Hungarian spontaneous speech

Authors
Beke, Andras [1]
Gosy, Maria [1]
Affiliation
[1] Hungarian Acad Sci Phonet, Res Inst Linguist, Budapest, Hungary
DOI
10.3233/IDT-140198
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many studies in phonetics and phonology have analyzed segmental durations, focusing on the factors, and the interactions among them, that determine duration. The results often play an important role in language technology applications such as text-to-speech synthesis (TTS) and automatic speech recognition (ASR), and are widely used in infocommunications today. Both the theory behind this topic and its applications in infocommunications increasingly rely on results from the cognitive sciences. This encourages scientists to discuss these results within the field of cognitive infocommunications (CogInfoCom), as supporting elements of novel kinds of cognitive capabilities of infocommunication systems. Speech sound duration depends on various factors, such as phonetic quality, phonological context, phonological position in the word or in the utterance, and speech style. This multifactorial dependence of vowel duration is more complex in languages where vowel length is a distinctive feature, as in Hungarian. The main goal of the present research was to analyze the physical durations of vowel pairs that exhibit a phonological length opposition in spontaneous speech. In addition, we intended to develop an algorithm for the automatic classification of short and long vowels occurring in spontaneous speech. On the basis of these findings, we intended to predict vowel durations automatically using three different methods. The measured data confirmed our hypothesis that phonologically short and long vowels differ significantly in their physical durations in spontaneous speech. The results of the automatic vowel length classification also supported this finding. The third aspect of our investigation was to use different supervised learning methods to predict vowel duration, based on different feature vectors consisting of characteristic and spectral features. The best result was obtained when the combined features were used with a feed-forward neural network (FFNN). The correlation between the target and the predicted vowel duration was 0.79, while the RMSE was 25 ms. The results support the complexity of the features affecting vowel duration, on the one hand, and indicate the temporal complexity of segments in spontaneous speech, as has been reported for Lithuanian, Czech, Hindi, Telugu and Korean, on the other.
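The abstract reports two evaluation figures, a correlation between target and predicted durations and an RMSE in milliseconds. As a minimal sketch of how such figures are typically computed, the following assumes the correlation is Pearson's r (the abstract does not say which coefficient was used). The function names and the sample durations are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(targets, predictions):
    """Pearson correlation between two equal-length sequences of durations."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    # Sum of co-deviations and sums of squared deviations around the means.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    return cov / math.sqrt(var_t * var_p)


def rmse(targets, predictions):
    """Root-mean-square error, in the same unit as the inputs (here ms)."""
    n = len(targets)
    squared_errors = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    return math.sqrt(squared_errors / n)


# Hypothetical vowel durations in milliseconds, for illustration only.
measured_ms = [62.0, 110.0, 75.0, 140.0, 90.0]
predicted_ms = [70.0, 95.0, 80.0, 120.0, 100.0]

print(f"r = {pearson_r(measured_ms, predicted_ms):.2f}")
print(f"RMSE = {rmse(measured_ms, predicted_ms):.1f} ms")
```

The two measures are complementary: the correlation shows how well the predictions track the relative ordering and spread of the measured durations, while the RMSE gives the typical size of the error in milliseconds.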
Pages: 301-314
Number of pages: 14