A Comparison of Data-Driven Automatic Syllabification Methods

Authors
Adsett, Connie R. [1]
Marchand, Yannick [1]
Affiliations
[1] Dalhousie Univ, Fac Comp Sci, Halifax, NS B3H 1W5, Canada
Keywords
Natural language processing; machine learning; automatic syllabification
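DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812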
Abstract
Although automatic syllabification is an important component of several natural language tasks, little has been done to compare the results of data-driven methods across a wide range of languages. This article compares five data-driven syllabification algorithms (Hidden Markov Support Vector Machines, IB1, Liang's algorithm, the Look Up Procedure, and Syllabification by Analogy) on nine European languages to determine which algorithm performs best overall. All algorithms achieve a mean word accuracy of over 90% across all lexicons. Syllabification by Analogy outperforms the other algorithms tested, with a mean word accuracy of 96.84% (standard deviation 2.93), whereas Liang's algorithm, the standard for hyphenation (used in TeX), produces the second-best results, with a mean of 95.67% (standard deviation 5.70).
Pages: 174-181
Page count: 8