Learning Character-level Representations for Part-of-Speech Tagging

Authors
dos Santos, Cicero Nogueira [1 ]
Zadrozny, Bianca [1 ]
Affiliations
[1] IBM Research Brazil, Av. Pasteur 138-146, BR-22296903 Rio de Janeiro, Brazil
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Distributed word representations have recently been proven to be an invaluable resource for NLP. These representations are normally learned using neural networks and capture syntactic and semantic information about words. Information about word morphology and shape is normally ignored when learning word representations. However, for tasks like part-of-speech tagging, intra-word information is extremely useful, especially when dealing with morphologically rich languages. In this paper, we propose a deep neural network that learns character-level representations of words and associates them with the usual word representations to perform POS tagging. Using the proposed approach, while avoiding the use of any handcrafted features, we produce state-of-the-art POS taggers for two languages: English, with 97.32% accuracy on the Penn Treebank WSJ corpus; and Portuguese, with 97.47% accuracy on the Mac-Morpho corpus, where the latter represents an error reduction of 12.2% over the best previously known result.
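The idea of joining a character-level representation with a word-level one can be sketched roughly as follows. The abstract does not spell out the architecture, so this minimal illustration assumes one plausible design: a convolution over windows of character embeddings followed by a max over character positions, concatenated with a lowercased word embedding. All dimensions, names (`char_level_repr`, `joint_repr`, `Lookup`), and the random, untrained weights are hypothetical and chosen only for demonstration.

```python
import random

random.seed(0)

CHAR_DIM = 4     # size of each character embedding (assumed)
WORD_DIM = 6     # size of each word embedding (assumed)
CONV_UNITS = 5   # size of the character-level word representation (assumed)
WINDOW = 3       # width of the character window (assumed)
PAD = "\0"       # padding symbol placed at word boundaries


def rand_vec(n):
    """Return a small random vector standing in for trained parameters."""
    return [random.uniform(-0.1, 0.1) for _ in range(n)]


class Lookup(dict):
    """Embedding table that creates a random vector on first access."""

    def __init__(self, dim):
        super().__init__()
        self.dim = dim

    def __missing__(self, key):
        self[key] = rand_vec(self.dim)
        return self[key]


char_emb = Lookup(CHAR_DIM)
word_emb = Lookup(WORD_DIM)
# One weight row per convolution unit, applied to a flattened character window.
conv_w = [rand_vec(CHAR_DIM * WINDOW) for _ in range(CONV_UNITS)]
conv_b = rand_vec(CONV_UNITS)


def char_level_repr(word):
    """Convolve over character windows, then take the max over positions."""
    half = WINDOW // 2
    chars = [PAD] * half + list(word) + [PAD] * half
    best = [float("-inf")] * CONV_UNITS
    for i in range(len(word)):
        # Flatten the embeddings of the characters in the current window.
        window = [x for c in chars[i:i + WINDOW] for x in char_emb[c]]
        for j in range(CONV_UNITS):
            score = conv_b[j] + sum(w * x for w, x in zip(conv_w[j], window))
            best[j] = max(best[j], score)
    return best


def joint_repr(word):
    """Concatenate the word embedding with the character-level vector."""
    return word_emb[word.lower()] + char_level_repr(word)


vec = joint_repr("Tagging")
print(len(vec))  # WORD_DIM + CONV_UNITS values per word
```

In this sketch the word embedding is looked up on the lowercased form while the character path sees the original casing, so capitalization and suffix cues can only enter through the character-level part. In a real tagger these vectors would be fed to further layers that score each tag, and all parameters would be trained rather than sampled at random.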
Pages: 1818-1826
Page count: 9