Application of different learning methods to Hungarian part-of-speech tagging

Authors
Horvath, T [1]
Alexin, Z
Gyimóthy, T
Wrobel, S
Affiliations
[1] GMD AiSKD, German Natl Res Ctr Informat Technol, D-53754 St Augustin, Germany
[2] Attila Jozsef Univ, Dept Appl Informat, H-6701 Szeged, Hungary
[3] Hungarian Acad Sci, Res Grp Artificial Intelligence, H-6720 Szeged, Hungary
[4] Univ Magdeburg, IWS, D-39106 Magdeburg, Germany
Source
INDUCTIVE LOGIC PROGRAMMING | 1999 / Vol. 1634
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
From the point of view of computational linguistics, Hungarian is a difficult language due to its complex grammar and rich morphology. This means that even a common task such as part-of-speech tagging presents a new challenge for learning when looked at for the Hungarian language, especially given the fact that this language has fairly free word order. In this paper we therefore present a case study designed to illustrate the potential and limits of current ILP and non-ILP algorithms on the Hungarian POS-tagging task. We have selected the popular C4.5 and Progol systems as propositional and ILP representatives, adding experiments with our own methods AGLEARN, a C4.5 preprocessor based on attribute grammars, and the ILP approaches PHM and RIBL. The systems were compared on the Hungarian version of the multilingual morphosyntactically annotated MULTEXT-East TELRI corpus, which consists of about 100,000 tokens. Experimental results indicate that Hungarian POS-tagging is indeed a challenging task for learning algorithms, that even simple background knowledge leads to large differences in accuracy, and that instance-based methods are promising approaches to POS tagging also for Hungarian. The paper also includes experiments with some different cascade connections of the taggers.
Pages: 128-139
Page count: 12