Learning Character-level Representations for Part-of-Speech Tagging

Authors
dos Santos, Cicero Nogueira [1]
Zadrozny, Bianca [1]
Affiliation
[1] IBM Res Brazil, Av Pasteur 138-146, BR-22296903 Rio De Janeiro, Brazil
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2) | 2014 / Vol. 32
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Distributed word representations have recently proven to be an invaluable resource for NLP. These representations are normally learned using neural networks and capture syntactic and semantic information about words. Information about word morphology and shape is normally ignored when learning word representations. However, for tasks like part-of-speech tagging, intra-word information is extremely useful, especially when dealing with morphologically rich languages. In this paper, we propose a deep neural network that learns character-level representations of words and associates them with the usual word representations to perform POS tagging. Using the proposed approach, while avoiding the use of any handcrafted features, we produce state-of-the-art POS taggers for two languages: English, with 97.32% accuracy on the Penn Treebank WSJ corpus; and Portuguese, with 97.47% accuracy on the Mac-Morpho corpus, where the latter represents an error reduction of 12.2% relative to the best previously known result.
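The idea of joining a character-level representation with a word-level one can be illustrated with a small sketch. The abstract does not spell out the architecture, so the following assumes a convolution over character windows followed by max pooling over positions. All dimensions, the padding symbol, and the function names (`char_level_repr`, `joint_repr`) are hypothetical, and the weights are random and untrained, so this shows only the shape of the computation, not the authors' implementation.

```python
import random

random.seed(0)

CHAR_DIM = 4    # hypothetical character embedding size
WINDOW = 3      # hypothetical character window size
CHAR_REPR = 5   # hypothetical size of the character-level word vector
WORD_DIM = 6    # hypothetical word embedding size
PAD = "\0"      # hypothetical padding symbol for word boundaries


def rand_vec(n):
    """Return n small random values standing in for learned parameters."""
    return [random.uniform(-0.1, 0.1) for _ in range(n)]


char_emb = {}   # character -> embedding, created on first use
word_emb = {}   # word -> embedding, created on first use
# Convolution weights: one row per output unit over a flattened window.
W = [rand_vec(CHAR_DIM * WINDOW) for _ in range(CHAR_REPR)]
b = rand_vec(CHAR_REPR)


def lookup(table, key, dim):
    """Fetch an embedding, creating a random one for unseen keys."""
    if key not in table:
        table[key] = rand_vec(dim)
    return table[key]


def char_level_repr(word):
    """Convolve over character windows, then max-pool over positions.

    Assumes a non-empty word.
    """
    half = WINDOW // 2
    padded = [PAD] * half + list(word) + [PAD] * half
    best = [float("-inf")] * CHAR_REPR
    for i in range(len(word)):
        window = []
        for ch in padded[i:i + WINDOW]:
            window.extend(lookup(char_emb, ch, CHAR_DIM))
        for j in range(CHAR_REPR):
            score = b[j] + sum(w * x for w, x in zip(W[j], window))
            best[j] = max(best[j], score)
    return best


def joint_repr(word):
    """Concatenate the word embedding with the character-level vector."""
    return lookup(word_emb, word, WORD_DIM) + char_level_repr(word)


print(len(joint_repr("tagging")))  # 11 = WORD_DIM + CHAR_REPR
```

Because the pooled vector has a fixed size regardless of word length, words of any length map to vectors of the same dimension, which a tagger could then score against the tag set.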
Pages: 1818-1826
Number of pages: 9