Why pay more? A simple and efficient named entity recognition system for tweets

Cited by: 13
Authors
Suman, Chanchal [1]
Reddy, Saichethan Miriyala [2]
Saha, Sriparna [1]
Bhattacharyya, Pushpak [1]
Affiliations
[1] Indian Inst Technol Patna, Comp Sci & Engn, Patna, Bihar, India
[2] IIIT Bhagalpur, Comp Sci & Engn, Bhagalpur, India
Keywords
Conditional random field; Hand-crafted feature; Long short term memory; Named entity recognition; Sequence labelling
DOI
10.1016/j.eswa.2020.114101
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The current paper investigates the problem of multimodal named entity recognition from Twitter data. Named entity recognition (NER) is an important task in natural language processing and has been carefully studied in recent decades. NER from tweets is particularly challenging because tweets 1) are limited in length, 2) contain noisy text, and 3) contain hashtags. Moreover, tweets are often associated with images and hyperlinks. Existing work on tweet NER mostly concentrates on multimodal deep-learning-based models, neglecting hand-crafted features and hyperlinks. The current paper investigates the incorporation of hand-crafted features extracted from different modalities, such as images and hyperlinks, when extracting named entities from tweet text. A large set of hand-crafted features is extracted from these modalities (images, hyperlinks) and combined with the features extracted by a hybrid deep neural model (a bi-directional LSTM and a CNN), followed by a conditional random field, to perform this task. Several variants of these models, paired with different hand-crafted feature sets, are designed. Extensive experiments on a multimodal Twitter dataset (containing text, images, and URLs) illustrate that character-level hand-crafted features significantly improve the performance of the systems. Results of the proposed models are also reported on a standard NER dataset, CoNLL 2003.
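As a rough illustration of what character-level hand-crafted features for tweet tokens can look like, the following minimal Python sketch maps each token to a small binary feature vector. The abstract does not list the paper's actual features, so every feature here (`is_hashtag`, `is_mention`, `is_url`, `init_cap`, `all_caps`, `has_digit`, `has_punct`) and both function names are assumptions chosen for illustration, not the authors' feature set.

```python
import re
from typing import Dict, List

# Hypothetical URL pattern; the paper's own hyperlink handling is not described in the abstract.
URL_RE = re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)


def char_features(token: str) -> Dict[str, bool]:
    """Return illustrative character-level features for one token (assumed set)."""
    return {
        "is_hashtag": token.startswith("#") and len(token) > 1,
        "is_mention": token.startswith("@") and len(token) > 1,
        "is_url": bool(URL_RE.match(token)),
        "init_cap": token[:1].isupper(),
        "all_caps": token.isupper(),
        "has_digit": any(ch.isdigit() for ch in token),
        "has_punct": any(not ch.isalnum() for ch in token),
    }


def featurize(tokens: List[str]) -> List[List[float]]:
    """Turn a tokenized tweet into one 0/1 feature vector per token."""
    return [[float(v) for v in char_features(t).values()] for t in tokens]


if __name__ == "__main__":
    for tok, vec in zip(["Visiting", "#Patna", "http://t.co/x"],
                        featurize(["Visiting", "#Patna", "http://t.co/x"])):
        print(tok, vec)
```

In a pipeline of the kind the abstract describes, vectors like these would typically be concatenated with each token's learned representation before the sequence-labelling layer; how the paper itself combines them is not specified here.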
Pages: 14