Why pay more? A simple and efficient named entity recognition system for tweets

Cited by: 13
Authors
Suman, Chanchal [1]
Reddy, Saichethan Miriyala [2]
Saha, Sriparna [1]
Bhattacharyya, Pushpak [1]
Affiliations
[1] Indian Inst Technol Patna, Comp Sci & Engn, Patna, Bihar, India
[2] IIIT Bhagalpur, Comp Sci & Engn, Bhagalpur, India
Keywords
Conditional random field; Hand-crafted feature; Long short-term memory; Named entity recognition; Sequence labelling
DOI
10.1016/j.eswa.2020.114101
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates the problem of multimodal named entity recognition from Twitter data. Named entity recognition (NER) is an important task in natural language processing and has been studied extensively in recent decades. NER from tweets is particularly challenging because tweets 1) are limited in length, 2) contain noisy text, and 3) contain hashtags. Moreover, tweets are often associated with images and hyperlinks. Existing work on tweet NER mostly concentrates on multimodal deep-learning-based models, neglecting hand-crafted features and hyperlinks. This paper investigates the incorporation of hand-crafted features extracted from different modalities, such as images and hyperlinks, when extracting named entities from tweet text. A large set of hand-crafted features is extracted from these modalities (images, hyperlinks) and combined with the features extracted by a hybrid deep neural model (a bi-directional LSTM and a CNN), followed by a conditional random field, to perform this task. Several variants of these models, paired with different hand-crafted feature sets, are designed. Extensive experiments on a multimodal Twitter dataset (containing text, images, and URLs) illustrate that character-level hand-crafted features significantly improve the performance of the systems. Results of the proposed models are also reported on a standard NER dataset, CoNLL 2003.
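As a rough illustration of what character-level hand-crafted features for tweet tokens might look like, the following minimal Python sketch computes a small feature vector per token. The feature names, the `URL_RE` pattern, and the helper functions `char_features` and `featurize` are assumptions made for this example only; the abstract does not list the paper's actual feature set, and in a full system such vectors would be combined with the learned BiLSTM/CNN representations before the CRF layer.

```python
import re
from typing import Dict, List

# Hypothetical URL pattern; not taken from the paper.
URL_RE = re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)


def char_features(token: str) -> Dict[str, float]:
    """Return an illustrative (assumed) set of character-level features."""
    n = max(len(token), 1)  # avoid division by zero for empty tokens
    return {
        "is_hashtag": float(token.startswith("#")),
        "is_mention": float(token.startswith("@")),
        "is_url": float(bool(URL_RE.match(token))),
        "init_cap": float(token[:1].isupper()),
        "all_caps": float(token.isupper()),
        "has_digit": float(any(c.isdigit() for c in token)),
        "upper_ratio": sum(c.isupper() for c in token) / n,
    }


def featurize(tokens: List[str]) -> List[List[float]]:
    """Turn a tokenized tweet into one fixed-order feature vector per token."""
    keys = sorted(char_features(""))  # stable feature ordering
    return [[char_features(t)[k] for k in keys] for t in tokens]


if __name__ == "__main__":
    tweet = ["Visiting", "#Patna", "with", "@friend", "https://t.co/abc"]
    for tok, vec in zip(tweet, featurize(tweet)):
        print(tok, vec)
```

Each vector here has seven entries in alphabetical key order; a model of the kind the abstract describes could concatenate it with the token's word and character embeddings.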
Pages: 14