Learning Character-level Representations for Part-of-Speech Tagging

Authors
dos Santos, Cicero Nogueira [1]
Zadrozny, Bianca [1]
Affiliations
[1] IBM Res Brazil, Av Pasteur 138-146, BR-22296903 Rio De Janeiro, Brazil
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Distributed word representations have recently been proven to be an invaluable resource for NLP. These representations are normally learned using neural networks and capture syntactic and semantic information about words. Information about word morphology and shape is normally ignored when learning word representations. However, for tasks like part-of-speech tagging, intra-word information is extremely useful, especially when dealing with morphologically rich languages. In this paper, we propose a deep neural network that learns character-level representations of words and associates them with the usual word representations to perform POS tagging. Using the proposed approach, while avoiding the use of any handcrafted features, we produce state-of-the-art POS taggers for two languages: English, with 97.32% accuracy on the Penn Treebank WSJ corpus; and Portuguese, with 97.47% accuracy on the Mac-Morpho corpus, where the latter represents an error reduction of 12.2% over the best previously known result.
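The abstract's idea of joining a character-level representation with a word-level one can be illustrated with a minimal sketch. The abstract does not give the architecture, so everything below is an assumption made for illustration: the character representation is computed by a convolution over character windows followed by a max over positions, the weights are random stand-ins for learned parameters, and all names (`PAD`, `make_embeddings`, `char_level_representation`, `joint_representation`) and dimensions are hypothetical.

```python
import random

PAD = "<pad>"  # hypothetical padding symbol for word boundaries


def make_embeddings(symbols, dim, rng):
    """Give each symbol a small random vector (stand-in for learned weights)."""
    return {s: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for s in symbols}


def char_level_representation(word, char_emb, filters, window=3):
    """Convolve over character windows, then max-pool each filter over positions.

    Assumes a non-empty word and an odd window size.
    """
    half = window // 2
    chars = [PAD] * half + list(word) + [PAD] * half
    windows = []
    for i in range(len(word)):
        vec = []
        for c in chars[i:i + window]:
            vec.extend(char_emb[c])  # concatenate the embeddings in this window
        windows.append(vec)
    # One feature per filter: its largest response over all character windows.
    return [max(sum(w * x for w, x in zip(f, win)) for win in windows)
            for f in filters]


def joint_representation(word, word_emb, char_emb, filters, window=3):
    """Concatenate the word-level vector with the character-level vector."""
    return word_emb[word.lower()] + char_level_representation(
        word, char_emb, filters, window)


# Illustrative sizes only; not taken from the paper.
WORD_DIM, CHAR_DIM, N_FILTERS, WINDOW = 6, 4, 5, 3
rng = random.Random(0)
words = ["The", "tagger", "tags", "words"]
alphabet = sorted(set("".join(words))) + [PAD]
word_emb = make_embeddings([w.lower() for w in words], WORD_DIM, rng)
char_emb = make_embeddings(alphabet, CHAR_DIM, rng)
filters = [[rng.uniform(-0.1, 0.1) for _ in range(CHAR_DIM * WINDOW)]
           for _ in range(N_FILTERS)]

vec = joint_representation("tagger", word_emb, char_emb, filters, WINDOW)
print(len(vec))  # word-level part followed by character-level part
```

In a tagger built along these lines, the joint vector of each word would be passed to a classifier over POS tags. Because the character-level part is computed from the raw spelling, it can carry capitalization and suffix information that a lowercased word lookup discards.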
Pages: 1818-1826
Number of pages: 9
Related papers
  • [21] Part-of-speech tagging without training
    Bressan, S
    Indradjaja, LS
    INTELLIGENCE IN COMMUNICATION SYSTEMS, 2004, 3283 : 112 - 119
  • [22] Corpus based part-of-speech tagging
    Lv, Chengyao
    Liu, Huihua
    Dong, Yuanxing
    Chen, Yunliang
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2016, 19 (03) : 647 - 654
  • [23] The Application of CRFs in Part-of-Speech Tagging
    Zhang Xiaofei
    Huang Heyan
    Zhang Liang
    2009 INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS, VOL 2, PROCEEDINGS, 2009, : 347 - +
  • [24] A CONNECTIONIST APPROACH TO PART-OF-SPEECH TAGGING
    Zamora-Martinez, F.
    Castro-Bleda, M. J.
    Espana-Boquera, S.
    Tortajada, Salvador
    Aibar, P.
    IJCCI 2009: PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON COMPUTATIONAL INTELLIGENCE, 2009, : 421 - +
  • [25] Part-of-speech tagging and partial parsing
    Abney, S
    CORPUS-BASED METHODS IN LANGUAGE AND SPEECH PROCESSING, 1997, 2 : 118 - 136
  • [26] Semi-Supervised Learning for Part-of-Speech Tagging of Mandarin Transcribed Speech
    Wang, Wen
    Huang, Zhongqiang
    Harper, Mary
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 137 - +
  • [27] Portuguese Part-of-Speech Tagging Using Entropy Guided Transformation Learning
    dos Santos, Cicero Nogueira
    Milidiu, Ruy L.
    Renteria, Raul P.
    COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROCEEDINGS, 2008, 5190 : 143 - +
  • [28] Deep Learning Architecture for Part-of-Speech Tagging with Word and Suffix Embeddings
    Popov, Alexander
    ARTIFICIAL INTELLIGENCE: METHODOLOGY, SYSTEMS, AND APPLICATIONS, AIMSA 2016, 2016, 9883 : 68 - 77
  • [29] Using machine learning techniques for part-of-speech tagging in the Greek language
    Petasis, G
    Paliouras, G
    Karkaletsis, V
    Spyropoulos, CD
    Androutsopoulos, I
    ADVANCES IN INFORMATICS, 2000, : 273 - 281
  • [30] Part-of-speech tagging with recurrent neural networks
    Pérez-Ortiz, JA
    Forcada, ML
    IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 1588 - 1592