Indonesian graphemic syllabification using a nearest neighbour classifier and recovery procedure

Authors
Edwina Anky Parande
Suyanto Suyanto
Affiliations
[1] Telkom University, School of Computing
Keywords
Bahasa Indonesia; Fuzzy; Graphemic syllabification; Nearest neighbour; Phonotactic rules; Speech recognition
Abstract
Automatic syllabification, the decomposition of a word into syllables, is an important component of an automatic speech recognition (ASR) system that uses syllable-based acoustic and language models. It can be applied to either phoneme or grapheme sequences. Phonemic syllabification is the more complex of the two, since it requires grapheme-to-phoneme (G2P) conversion as a preceding step. It generally achieves high accuracy on formal words, but its accuracy may drop on person names. In contrast, graphemic syllabification is simpler and has more potential for person names. This research develops a graphemic syllabification model that combines phonotactic rules with Fuzzy k-Nearest Neighbour in every Class (FkNNC). The phonotactic rules are designed to find the deterministic syllabification points, while FkNNC, a statistical classifier, is expected to find the remaining stochastic syllabification points. A recovery procedure is proposed to correct wrong syllabification points produced by FkNNC. Five-fold cross-validation on a dataset of 50k formal words, selected from the great dictionary of the Indonesian language, shows that the proposed model yields a syllable error rate (SER) of 2.48%, which the recovery procedure reduces to 2.27%. This is still higher than the SER of phonemic syllabification (only 0.99%). However, the model can also handle a dataset of 15k high-variance person names with an SER of 7.45%, which the recovery procedure reduces to 6.78%.
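To make the classification framing in the abstract concrete, the following is a minimal sketch, not the authors' FkNNC. It assumes that every gap between two graphemes is a candidate syllabification point, that the features are a fixed window of surrounding graphemes, and that the "in every class" idea can be approximated by taking the k nearest neighbours from each class separately and normalising their inverse-distance weights. The function names, the `#` padding symbol, the Hamming distance, and the 0.5 threshold are all illustrative assumptions. The phonotactic rules and the recovery procedure are not specified in the abstract and are left out.

```python
def boundary_samples(syllabified, width=2):
    """Turn 'ma-kan' into (context window, is_boundary) pairs, one per grapheme gap."""
    word = syllabified.replace("-", "")
    cuts, pos = set(), 0
    for syllable in syllabified.split("-")[:-1]:
        pos += len(syllable)
        cuts.add(pos)  # a boundary sits before word[pos]
    padded = "#" * width + word + "#" * width
    # Gap i lies between word[i-1] and word[i]; its window is padded[i : i + 2*width].
    return [(padded[i:i + 2 * width], i in cuts) for i in range(1, len(word))]


def hamming(a, b):
    """Number of positions at which two equal-length windows differ."""
    return sum(x != y for x, y in zip(a, b))


def boundary_membership(window, training, k=1):
    """Fuzzy-style membership in the 'boundary' class (assumed weighting scheme)."""
    score = {}
    for label in (True, False):
        # Take the k nearest neighbours within this class only.
        nearest = sorted(hamming(window, w) for w, y in training if y == label)[:k]
        score[label] = sum(1.0 / (1.0 + d) for d in nearest)
    total = score[True] + score[False]
    return score[True] / total if total else 0.0


def syllabify(word, training, width=2, k=1):
    """Insert '-' at every gap whose boundary membership exceeds 0.5."""
    if not word:
        return ""
    padded = "#" * width + word + "#" * width
    out = word[0]
    for i in range(1, len(word)):
        if boundary_membership(padded[i:i + 2 * width], training, k) > 0.5:
            out += "-"
        out += word[i]
    return out


# Toy training set (hypothetical, three words only).
training = [s for w in ("ma-kan", "ba-tu", "lam-pu") for s in boundary_samples(w)]
print(syllabify("makan", training))
print(syllabify("baku", training))
```

A real system would train on tens of thousands of words and, as the abstract describes, fix the deterministic points with phonotactic rules before the classifier is consulted.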
Pages: 13-20
Page count: 7
Source
Parande, Edwina Anky; Suyanto, Suyanto. Indonesian graphemic syllabification using a nearest neighbour classifier and recovery procedure. International Journal of Speech Technology, 2019, 22(1): 13-20.