Automated grapheme-to-phoneme conversion for Central Kurdish based on optimality theory

Cited by: 4
Authors
Mahmudi, Aso [1]
Veisi, Hadi [1]
Affiliations
[1] Univ Tehran, Fac New Sci & Technol, Tehran, Iran
Keywords
Grapheme-to-phoneme conversion; Optimality Theory; Central Kurdish; Kurdish phonology
DOI
10.1016/j.csl.2021.101222
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The writing system of Central Kurdish features three cases in which there is no one-to-one mapping between the letters of the orthography and the phonemes of the language. Consequently, written words containing these cases may be pronounced in multiple ways. The process of finding the correct pronunciation of written words is called Grapheme-to-Phoneme (G2P) conversion and is a key step in natural language processing tasks such as speech synthesis. As Central Kurdish is a low-resourced language, we present a G2P conversion method based on the phonological rules of the language, rather than on pronunciation dictionaries and data-driven learning methods. After reviewing the phonology and alphabet of the language through the framework of Optimality Theory, we generate all possible pronunciations. Then, by specifying and applying ranked constraints, we eliminate undesirable candidates so as to keep only one well-formed pronunciation per word. Evaluated on two datasets, the proposed method yielded an overall Phoneme Error Rate (PER) of 0.75%, with 94.71% precision in detecting the short vowel /i/ and 100% accuracy in converting the letters "(sic)" and "(sic)". These results suggest that there is no need for additional new letters in the current orthographic system of Central Kurdish. This approach also enables us to have a ranked suggestion list for the manual checking of the few unresolved ambiguous situations. © 2021 Elsevier Ltd. All rights reserved.
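The generate-then-filter idea described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the Latin transliteration, the `READINGS` table, the three constraints (`no_hiatus`, `onset`, `no_cluster`) and their ranking are hypothetical and are not the constraint set used by the authors. Insertion of the unwritten short vowel /i/ is not modelled.

```python
from itertools import product

VOWELS = set("aeiou")

# Hypothetical reading table for a toy Latin transliteration:
# "w" may be the glide /w/ or the vowel /u/, "y" the glide /j/ or the vowel /i/.
READINGS = {"w": ("w", "u"), "y": ("j", "i")}


def gen(word):
    """GEN: enumerate every combination of readings for the written letters."""
    options = [READINGS.get(ch, (ch,)) for ch in word]
    return ["".join(choice) for choice in product(*options)]


def no_hiatus(cand):
    """One violation per pair of adjacent vowels."""
    return sum(a in VOWELS and b in VOWELS for a, b in zip(cand, cand[1:]))


def onset(cand):
    """One violation if the candidate begins with a vowel."""
    return int(bool(cand) and cand[0] in VOWELS)


def no_cluster(cand):
    """One violation per pair of adjacent consonants."""
    return sum(a not in VOWELS and b not in VOWELS for a, b in zip(cand, cand[1:]))


# Illustrative ranking, highest-ranked constraint first (an assumption).
RANKING = (no_hiatus, onset, no_cluster)


def evaluate(word):
    """EVAL: order candidates by violation profile, compared lexicographically."""
    return sorted(gen(word), key=lambda c: tuple(con(c) for con in RANKING))


for word in ("kwr", "wan", "yar"):
    print(word, "->", evaluate(word))
```

Comparing violation tuples lexicographically mirrors strict domination: a single violation of a higher-ranked constraint outweighs any number of violations of lower-ranked ones. Returning the whole sorted list, rather than only the winner, corresponds to the ranked suggestion list the abstract mentions for manual checking of unresolved cases.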
Pages: 15