The linguistic basis of a rule-based tagger of Czech

Cited by: 0
Authors
Oliva, K [1]
Hnátková, M
Petkevic, V
Kveton, P
Affiliations
[1] Univ Saarland, D-6600 Saarbrucken, Germany
[2] Charles Univ, Fac Arts, Prague, Czech Republic
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes the conception of a rule-based tagger (part-of-speech disambiguator) of Czech currently developed for tagging the Czech National Corpus (cf. [2]). The input of the tagger consists of sentences whose words are assigned all possible morphological analyses. The tagger disambiguates this input by successive elimination of tags which are syntactically implausible in the sentential context of the particular word. Due to this, the tagger promises substantially higher accuracy than current stochastic taggers for Czech. This is documented by the results concerning the disambiguation of the most frequent ambiguous word form in Czech - the word se.
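The elimination strategy described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy tag names (`REFL`, `PREP_INS`, `NOUN_NOM`, `NOUN_INS`, `VERB`), the helper names, and the single rule are hypothetical and are not taken from the paper or from the actual tagger. The sketch only shows the general shape of the approach: each word starts with all of its candidate tags, and rules repeatedly remove readings that look implausible in the sentence, without ever removing a word's last remaining tag.

```python
from typing import Callable, List, Set, Tuple

# A token is a word form plus the set of all tags its morphological analysis allows.
Token = Tuple[str, Set[str]]
# A rule inspects the sentence at position i and returns the tags it considers implausible there.
Rule = Callable[[List[Token], int], Set[str]]


def no_instrumental_to_the_right(sentence: List[Token], i: int) -> Set[str]:
    """Hypothetical rule (not from the paper): a preposition governing the
    instrumental is implausible if no later word can be instrumental."""
    _, tags = sentence[i]
    if "PREP_INS" not in tags:
        return set()
    later_can_be_ins = any(
        any(t.endswith("_INS") and t != "PREP_INS" for t in later_tags)
        for _, later_tags in sentence[i + 1:]
    )
    return set() if later_can_be_ins else {"PREP_INS"}


def disambiguate(sentence: List[Token], rules: List[Rule]) -> List[Token]:
    """Apply the rules until no rule removes anything more."""
    # Work on copies so the caller's tag sets are left untouched.
    result = [(word, set(tags)) for word, tags in sentence]
    changed = True
    while changed:
        changed = False
        for i, (_, tags) in enumerate(result):
            for rule in rules:
                doomed = rule(result, i) & tags
                # Never eliminate the last remaining reading of a word.
                if doomed and tags - doomed:
                    tags -= doomed  # in-place update of the set stored in result
                    changed = True
    return result


# "Petr se směje": nothing after "se" can be instrumental,
# so the toy rule discards the preposition reading.
laughing = [("Petr", {"NOUN_NOM"}), ("se", {"REFL", "PREP_INS"}), ("směje", {"VERB"})]
# "se psem": an instrumental noun follows, so the toy rule leaves both readings in place.
with_dog = [("se", {"REFL", "PREP_INS"}), ("psem", {"NOUN_INS"})]

laughing_out = disambiguate(laughing, [no_instrumental_to_the_right])
with_dog_out = disambiguate(with_dog, [no_instrumental_to_the_right])
```

In this sketch a rule only ever removes readings and never picks a winner, so a word stays ambiguous whenever no rule can safely exclude a reading; whether the real tagger behaves this way in detail is not stated in the abstract.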
Pages: 3-8
Number of pages: 6
Related papers
50 in total
  • [1] A rule-based tagger for Polish based on genetic algorithm
    Piasecki, M
    Gawel, B
    [J]. INTELLIGENT INFORMATION PROCESSING AND WEB MINING, PROCEEDINGS, 2005, : 247 - 255
  • [2] HMM based POS tagger and rule-based chunker for Bengali
    Bandyopadhyay, Sivaji
    Ekbal, Asif
    [J]. PROCEEDINGS OF THE SIXTH INTERNATIONAL CONFERENCE ON ADVANCES IN PATTERN RECOGNITION, 2007, : 384 - +
  • [3] Boosting Statistical Tagger Accuracy with Simple Rule-Based Grammars
    Hulden, Mans
    Francom, Jerid
    [J]. LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 2114 - 2117
  • [4] Building an Indonesian Rule-Based Part-of-Speech Tagger
    Rashel, Fam
    Luthfi, Andry
    Dinakaramani, Arawinda
    Manurung, Ruli
    [J]. PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2014), 2014, : 70 - 73
  • [5] Development of Automatic Rule-based Semantic Tagger and Karaka Analyzer for Hindi
    Katyayan, Pragya
    Joshi, Nisheeth
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (02)
  • [6] The Rule-Based Approach to Czech Grammaticalized Alternations
    Kettnerova, Vaclava
    Lopatkova, Marketa
    Uresova, Zdenka
    [J]. TEXT, SPEECH AND DIALOGUE, TSD 2012, 2012, 7499 : 158 - 165
  • [7] A Hybrid of Rule-based and HMM-based Part-of-Speech Tagger for Indonesian
    Ananda, Muhammad Ridho
    Hanifmuti, Muhammad Yudistira
    Alfina, Ika
    [J]. 2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021, : 280 - 285
  • [8] Identification of POS Tags for the Khasi Language based on Brill's Transformation Rule-Based Tagger
    Warjri, Sunita
    Pakray, Partha
    Lyngdoh, Saralin A.
    Maji, Arnab Kumar
    [J]. COMPUTACION Y SISTEMAS, 2022, 26 (02) : 989 - 1005
  • [9] Tagging Icelandic text: A linguistic rule-based approach
    Loftsson, Hrafn
    [J]. NORDIC JOURNAL OF LINGUISTICS, 2008, 31 (01) : 47 - 72
  • [10] An efficient part-of-speech tagger rule-based approach of Sanskrit language analysis
    Tapaswi N.
    [J]. International Journal of Information Technology, 2024, 16 (2) : 901 - 908