Building a Multimodal Entity Linking Dataset From Tweets

Cited by: 0
Authors
Adjali, Omar [1]
Besancon, Romaric [1]
Ferret, Olivier [1]
Le Borgne, Herve [1]
Grau, Brigitte [2]
Affiliations
[1] CEA, LIST, F-91191 Gif Sur Yvette, France
[2] Univ Paris Saclay, LIMSI, CNRS, F-91405 Orsay, France
Keywords
Entity linking; social media; multimodality; multimedia entity linking
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The task of entity linking, which aims to associate an entity mention with a unique entity in a knowledge base (KB), is useful for advanced information extraction tasks such as relation extraction or event detection. Most studies that address this problem rely only on textual documents, while an increasing number of sources are multimedia, particularly in social media, where messages are often illustrated with images. In this article, we address the Multimodal Entity Linking (MEL) task, and more particularly the problem of its evaluation. To this end, we propose a novel method to quasi-automatically build annotated datasets for evaluating methods on the MEL task. The method collects text and images to jointly build a corpus of tweets with ambiguous mentions along with a Twitter KB defining the entities. We release a new annotated dataset of Twitter posts associated with images. We study the key characteristics of the proposed dataset and evaluate the performance of several MEL approaches on it.
Pages: 4285-4292
Page count: 8