Improving part of speech disambiguation rules by adding linguistic knowledge

Cited by: 0
Authors
Lindberg, N [1]
Eineborg, M
Affiliations
[1] Royal Inst Technol, Dept Speech Mus & Hearing, Ctr Speech Technol, Stockholm, Sweden
[2] Stockholm Univ, Royal Inst Technol, Dept Comp Sci & Syst, Machine Learning Grp, S-10691 Stockholm, Sweden
Source
INDUCTIVE LOGIC PROGRAMMING | 1999 / Vol. 1634
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper reports the ongoing work of producing a state of the art part of speech tagger for unedited Swedish text. Rules eliminating faulty tags have been induced using Progol. In previously reported experiments, almost no linguistically motivated background knowledge was used [5, 8]. Still, the result was rather promising (recall 97.7%, with a pending average ambiguity of 1.13 tags/word). Compared to the previous study, a much richer, more linguistically motivated, background knowledge has been supplied, consisting of examples of noun phrases, verb chains, auxiliary verbs, and sets of part of speech categories. The aim has been to create the background knowledge rapidly, without laborious hand-coding of linguistic knowledge. In addition to the new background knowledge, new, more expressive rule types have been induced for two part of speech categories and compared to the corresponding rules of the previous bottom-line experiment. The new rules perform considerably better, with a recall of 99.4% for the new rules, compared to 97.6% for the old rules. Precision was slightly better for the new rules.
Pages: 186 - 197
Page count: 12
Related papers
50 in total
  • [31] A Method for Disambiguation of Part-of-Speech Homonymy Based on Application of Syntactic Compatibility in the Russian Language
    Klyshinsky, E. S.
    Kochetkova, N. A.
    Litvinov, M. I.
    Maksimov, V. Yu.
    AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS, 2011, 45 (01) : 15 - 19
  • [32] IMPROVING SENTIMENT ANALYSIS WITH PART-OF-SPEECH WEIGHTING
    Nicholls, Chris
    Song, Fei
    PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009 : 1592 - 1597
  • [33] PosWSD: Low-Resource Word Sense Disambiguation Model using Part Of Speech Information
    Chen, Yazhen
    Zhang, Jian
    He, Qipeng
    2022 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2022), 2022 : 26 - 31
  • [34] Part-of-speech tagging and PP attachment disambiguation using a boosted maximum entropy model
    Park, SB
    Jangmin, O
    Lee, SJ
    PRICAI 2004: TRENDS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, 3157 : 930 - 931
  • [35] Unsupervised Part-of-Speech Disambiguation for High Frequency Words and Its Influence on Unsupervised Parsing
    Haenig, Christian
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2010, 6008 : 113 - 120
  • [36] Improving Thai Word and Sentence Segmentation Using Linguistic Knowledge
    Nararatwong, Rungsiman
    Kertkeidkachorn, Natthawut
    Cooharojananone, Nagul
    Okada, Hitoshi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2018, E101D (12) : 3218 - 3225
  • [37] Improving statistical machine translation using shallow linguistic knowledge
    Hwang, Young-Sook
    Finch, Andrew
    Sasaki, Yutaka
    COMPUTER SPEECH AND LANGUAGE, 2007, 21 (02) : 350 - 372
  • [38] Adding part-of-speech information to the SUBTLEX-US word frequencies
    Brysbaert, Marc
    New, Boris
    Keuleers, Emmanuel
    BEHAVIOR RESEARCH METHODS, 2012, 44 (04) : 991 - 997
  • [39] Adding part-of-speech information to the SUBTLEX-US word frequencies
    Marc Brysbaert
    Boris New
    Emmanuel Keuleers
    Behavior Research Methods, 2012, 44 : 991 - 997
  • [40] A SPEECH UNDERSTANDING AND DIALOG SYSTEM WITH A HOMOGENEOUS LINGUISTIC KNOWLEDGE-BASE
    MAST, M
    KUMMERT, F
    EHRLICH, U
    FINK, GA
    KUHN, T
    NIEMANN, H
    SAGERER, G
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1994, 16 (02) : 179 - 194