Building a Multimodal Entity Linking Dataset From Tweets

Cited by: 0
Authors
Adjali, Omar [1]
Besancon, Romaric [1]
Ferret, Olivier [1]
Le Borgne, Herve [1]
Grau, Brigitte [2]
Affiliations
[1] CEA, LIST, F-91191 Gif Sur Yvette, France
[2] Univ Paris Saclay, LIMSI, CNRS, F-91405 Orsay, France
Keywords
Entity linking; social media; multimodality; multimedia entity linking
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Entity linking, which aims at associating an entity mention with a unique entity in a knowledge base (KB), is useful for advanced Information Extraction tasks such as relation extraction or event detection. Most studies that address this problem rely only on textual documents, while an increasing number of sources are multimedia, in particular in the context of social media, where messages are often illustrated with images. In this article, we address the Multimodal Entity Linking (MEL) task, and more particularly the problem of its evaluation. To this end, we propose a novel method to quasi-automatically build annotated datasets for evaluating methods on the MEL task. The method collects text and images to jointly build a corpus of tweets with ambiguous mentions along with a Twitter KB defining the entities. We release a new annotated dataset of Twitter posts associated with images. We study the key characteristics of the proposed dataset and evaluate the performance of several MEL approaches on it.
Pages: 4285-4292
Number of pages: 8
Related papers
50 in total
  • [31] Linking News and Tweets
    Lin, Xiaojie
    Gu, Ye
    Zhang, Rui
    Fan, Ju
    DATABASES THEORY AND APPLICATIONS, (ADC 2016), 2016, 9877 : 467 - 470
  • [32] Linking Obesity and Tweets
    Anwar, Mohd
    Yuan, Zhuoning
    SMART HEALTH, ICSH 2015, 2016, 9545 : 254 - 266
  • [33] MORE: A Multimodal Object-Entity Relation Extraction Dataset with a Benchmark Evaluation
    He, Liang
    Wang, Hongke
    Cao, Yongchang
    Wu, Zhen
    Zhang, Jianbing
    Dai, Xinyu
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 4564 - 4573
  • [34] Named Entity Recognition for Tweets
    Liu, Xiaohua
    Wei, Furu
    Zhang, Shaodian
    Zhou, Ming
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2013, 4 (01)
  • [35] ELMD: An Automatically Generated Entity Linking Gold Standard Dataset in the Music Domain
    Oramas, Sergio
    Espinosa-Anke, Luis
    Sordo, Mohamed
    Saggion, Horacio
    Serra, Xavier
    LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016, : 3312 - 3317
  • [36] Inferring Climate Change Stances from Multimodal Tweets
    Bai, Nan
    Torres, Ricardo da Silva
    Fensel, Anna
    Metze, Tamara
    Dewulf, Art
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 2467 - 2471
  • [37] The ChatGPT and Education Tweets Dataset
    Barandoni, Simone
    Chiarello, Filippo
    Giordano, Vito
    Fantoni, Gualtiero
    MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2023, PT II, 2025, 2134 : 18 - 32
  • [38] A Dataset for Detecting Stance in Tweets
    Mohammad, Saif M.
    Kiritchenko, Svetlana
    Sobhani, Parinaz
    Zhu, Xiaodan
    Cherry, Colin
    LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016, : 3945 - 3952
  • [39] METS-CoV: A Dataset of Medical Entity and Targeted Sentiment on COVID-19 Related Tweets
    Zhou, Peilin
    Wang, Zeqiang
    Chong, Dading
    Guo, Zhijiang
    Hua, Yining
    Su, Zichang
    Teng, Zhiyang
    Wu, Jiageng
    Yang, Jie
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [40] Entity-based Opinion Mining from Spanish Tweets
    Paniagua-Reyes, Fabian
    Reyes-Ortiz, Jose A.
    Bravo, Maricela
    PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON DATA SCIENCE, TECHNOLOGY AND APPLICATIONS (DATA), 2017, : 400 - 407