Boosting Statistical Tagger Accuracy with Simple Rule-Based Grammars

Cited by: 0
Authors
Hulden, Mans [1]
Francom, Jerid [1]
Affiliations
[1] Wake Forest Univ, Ikerbasque Basque Fdn Sci, Winston Salem, NC 27109 USA
Keywords
part-of-speech tagging; constraint grammar; hybrid POS tagging; HMM taggers; Spanish
DOI
Not available
Chinese Library Classification number
H0 [Linguistics]
Subject classification codes
030303; 0501; 050102
Abstract
We report on several experiments on combining a rule-based tagger and a trigram tagger for Spanish. The results show that one can boost the accuracy of the best-performing n-gram taggers by quickly developing a rough rule-based grammar to complement the statistically induced one and then combining the output of the two. The specific method of combination is crucial for achieving good results. The method provides particularly large gains in accuracy when only a small amount of tagged data is available for training an HMM, as may be the case for lesser-resourced and minority languages.
Pages: 2114-2117
Page count: 4
Related papers
50 in total
  • [1] The linguistic basis of a rule-based tagger of Czech
    Oliva, K
    Hnátková, M
    Petkevic, V
    Kveton, P
    [J]. TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2000, 1902 : 3 - 8
  • [2] A rule-based tagger for Polish based on genetic algorithm
    Piasecki, M
    Gawel, B
    [J]. INTELLIGENT INFORMATION PROCESSING AND WEB MINING, PROCEEDINGS, 2005, : 247 - 255
  • [3] HMM based POS tagger and rule-based chunker for Bengali
    Bandyopadhyay, Sivaji
    Ekbal, Asif
    [J]. PROCEEDINGS OF THE SIXTH INTERNATIONAL CONFERENCE ON ADVANCES IN PATTERN RECOGNITION, 2007, : 384 - +
  • [4] Building an Indonesian Rule-Based Part-of-Speech Tagger
    Rashel, Fam
    Luthfi, Andry
    Dinakaramani, Arawinda
    Manurung, Ruli
    [J]. PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2014), 2014, : 70 - 73
  • [5] APPLICATION OF GRAPH-GRAMMARS TO RULE-BASED SYSTEMS
    KORFF, M
    [J]. LECTURE NOTES IN COMPUTER SCIENCE, 1991, 532 : 505 - 519
  • [6] Development of Automatic Rule-based Semantic Tagger and Karaka Analyzer for Hindi
    Katyayan, Pragya
    Joshi, Nisheeth
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (02)
  • [7] FROM RULE-BASED TO STATISTICAL GRAMMARS: CONTINUOUS IMPROVEMENT OF LARGE-SCALE SPOKEN DIALOG SYSTEMS
    Suendermann, D.
    Evanini, K.
    Liscombe, J.
    Hunter, P.
    Dayanidhi, K.
    Pieraccini, R.
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4713 - 4716
  • [8] A COMBINED STATISTICAL AND RULE-BASED CLASSIFIER
    TIEN, D
    NICKOLLS, P
    [J]. IMAGES OF THE TWENTY-FIRST CENTURY, PTS 1-6, 1989, 11 : 1829 - 1829
  • [9] A Hybrid of Rule-based and HMM-based Part-of-Speech Tagger for Indonesian
    Ananda, Muhammad Ridho
    Hanifmuti, Muhammad Yudistira
    Alfina, Ika
    [J]. 2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021, : 280 - 285
  • [10] Identification of POS Tags for the Khasi Language based on Brill's Transformation Rule-Based Tagger
    Warjri, Sunita
    Pakray, Partha
    Lyngdoh, Saralin A.
    Maji, Arnab Kumar
    [J]. COMPUTACION Y SISTEMAS, 2022, 26 (02) : 989 - 1005