Implementing an efficient part-of-speech tagger

Cited by: 0
Authors
Carlberger, J [1]
Kann, V [1]
Affiliations
[1] Royal Inst Technol, SE-10044 Stockholm, Sweden
Source
SOFTWARE-PRACTICE & EXPERIENCE | 1999 / Vol. 29 / Issue 09
Keywords
part-of-speech tagging; word tagging; optimization; hidden Markov models
DOI
10.1002/(SICI)1097-024X(19990725)29:9<815::AID-SPE256>3.0.CO;2-F
Chinese Library Classification number
TP31 [Computer software]
Subject classification code
081202; 0835
Abstract
An efficient implementation of a part-of-speech tagger for Swedish is described. The stochastic tagger uses a well-established Markov model of the language. The tagger tags 92 per cent of unknown words correctly and up to 97 per cent of all words. Several implementation and optimization considerations are discussed. The main contribution of this paper is the thorough description of the tagging algorithm and the addition of a number of improvements. The paper contains enough detail for the reader to construct a tagger for his own language. Copyright (C) 1999 John Wiley & Sons, Ltd.
Pages: 815-832
Page count: 18
Related papers
50 in total
  • [1] An Efficient Part-of-Speech Tagger for Arabic
    Kopru, Selcuk
    [J]. COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, PT I, 2011, 6608 : 202 - 213
  • [2] An Accurate Persian Part-of-Speech Tagger
    Okhovvat, Morteza
    Sharifi, Mohsen
    Bidgoli, Behrouz Minaei
    [J]. COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2020, 35 (06): : 423 - 430
  • [3] A Practical Part-of-Speech Tagger for Bengali
    Sarkar, Kamal
    Gayen, Vivekananda
    [J]. 2012 THIRD INTERNATIONAL CONFERENCE ON EMERGING APPLICATIONS OF INFORMATION TECHNOLOGY (EAIT), 2012, : 36 - 40
  • [4] TnT - A statistical part-of-speech tagger
    Brants, T
    [J]. 6TH APPLIED NATURAL LANGUAGE PROCESSING CONFERENCE/1ST MEETING OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE AND PROCEEDINGS OF THE ANLP-NAACL 2000 STUDENT RESEARCH WORKSHOP, 2000, : 224 - 231
  • [5] MedPost: a part-of-speech tagger for bioMedical text
    Smith, L
    Rindflesch, T
    Wilbur, WJ
    [J]. BIOINFORMATICS, 2004, 20 (14) : 2320 - 2321
  • [6] Tamil Part-of-Speech tagger based on SVMTool
    Dhanalakshmi, V
    Anandkumar, M.
    Vijaya, M. S.
    Loganathan, R.
    Soman, K. P.
    Rajendran, S.
    [J]. RECENT ADVANCES OF ASIAN LANGUAGE PROCESSING TECHNOLOGIES, 2008, : 59 - +
  • [7] Toward an Effective Igbo Part-of-Speech Tagger
    Onyenwe, Ikechukwu E.
    Hepple, Mark
    Chinedu, Uchechukwu
    Ezeani, Ignatius
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2019, 18 (04)
  • [8] A suffix based part-of-speech tagger for Turkish
    Dincer, Taner
    Karaoglan, Bahar
    Kisla, Tarik
    [J]. PROCEEDINGS OF THE FIFTH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: NEW GENERATIONS, 2008, : 680 - +
  • [9] Developing a Part-Of-Speech tagger for te reo Maori
    Finn, Aoife
    Jones, Peter-Lucas
    Mahelona, Keoni
    Duncan, Suzanne
    Leoni, Gianna
    [J]. PROCEEDINGS OF THE FIFTH WORKSHOP ON THE USE OF COMPUTATIONAL METHODS IN THE STUDY OF ENDANGERED LANGUAGES (COMPUTEL-5 2022), 2022, : 93 - 98
  • [10] An automatic part-of-speech tagger for Middle Low German
    Koleva, Mariya
    Farasyn, Melissa
    Desmet, Bart
    Breitbarth, Anne
    Hoste, Veronique
    [J]. INTERNATIONAL JOURNAL OF CORPUS LINGUISTICS, 2017, 22 (01) : 107 - 140