Application of different learning methods to Hungarian part-of-speech tagging

Authors
Horvath, T [1]
Alexin, Z
Gyimóthy, T
Wrobel, S
Affiliations
[1] GMD AiSKD, German Natl Res Ctr Informat Technol, D-53754 St Augustin, Germany
[2] Attila Jozsef Univ, Dept Appl Informat, H-6701 Szeged, Hungary
[3] Hungarian Acad Sci, Res Grp Artificial Intelligence, H-6720 Szeged, Hungary
[4] Univ Magdeburg, IWS, D-39106 Magdeburg, Germany
Source
INDUCTIVE LOGIC PROGRAMMING | 1999 / Vol. 1634
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
From the point of view of computational linguistics, Hungarian is a difficult language due to its complex grammar and rich morphology. This means that even a common task such as part-of-speech tagging presents a new challenge for learning when applied to Hungarian, especially given that the language has fairly free word order. In this paper we therefore present a case study designed to illustrate the potential and limits of current ILP and non-ILP algorithms on the Hungarian POS-tagging task. We have selected the popular C4.5 and Progol systems as propositional and ILP representatives, adding experiments with our own methods AGLEARN, a C4.5 preprocessor based on attribute grammars, and the ILP approaches PHM and RIBL. The systems were compared on the Hungarian version of the multilingual morphosyntactically annotated MULTEXT-East TELRI corpus, which consists of about 100,000 tokens. Experimental results indicate that Hungarian POS tagging is indeed a challenging task for learning algorithms, that even simple background knowledge leads to large differences in accuracy, and that instance-based methods are promising approaches to POS tagging for Hungarian as well. The paper also includes experiments with several cascade connections of the taggers.
Pages: 128-139
Page count: 12