Portuguese Part-of-Speech Tagging with Large Margin Structure Learning

Cited by: 1
Authors
Fernandes, Eraldo R. [1]
Rodrigues, Irving M. [1]
Milidiu, Ruy L. [2]
Affiliations
[1] FACOM UFMS, Campo Grande, Brazil
[2] DU PUC Rio, Rio de Janeiro, Brazil
DOI
10.1109/BRACIS.2014.16
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Part-of-speech tagging is a fundamental task in many Natural Language Processing systems. The task consists of identifying the syntactic category, i.e., the part of speech, of each word in a sentence. Although the current state-of-the-art accuracy for this task is around 97%, any improvement has an immediate impact on more complex tasks such as Parsing, Semantic Role Labeling, and Information Extraction. Thus, it is still relevant to explore this task. In this paper, we introduce a part-of-speech tagger based on the Structure Learning framework that reduces the smallest known error on the Portuguese Mac-Morpho corpus by 7.8%. We also apply our tagger to a recently revised version of Mac-Morpho. Our system's accuracy on this latter version is competitive with a semi-supervised Neural Network trained on Mac-Morpho plus a very large non-annotated corpus. Additionally, our system is simpler than previous systems and uses a very limited feature set. Our system employs a Large Margin training criterion to derive a structure predictor that is more robust on unseen data.
Pages: 25-30
Page count: 6