Headword-Oriented Entity Linking: A Special Entity Linking Task with Dataset and Baseline

Authors
Yang, Mu [1]
Chen, Chi-Yen [1]
Lee, Yi-Hui [1]
Zeng, Qian-Hui [1]
Ma, Wei-Yun [1]
Shih, Chen-Yang [2]
Chen, Wei-Jhih [2]
Affiliations
[1] Acad Sinica, Inst Informat Sci, Taipei, Taiwan
[2] PIXNET Corp, R&D Ctr, Taipei, Taiwan
Keywords
Corpus; Information Extraction; Distant Supervision
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In this paper, we design headword-oriented entity linking (HEL), a specialized entity linking problem in which only the headwords of entities are linked to knowledge bases; the mention scopes of the entities do not need to be identified. The task is motivated by the fact that many articles referring to specific products rarely write out the full product names; instead, the names are often abbreviated to shorter, irregular forms or even reduced to their headwords, which are usually product types, such as "stick" or "mask" in a cosmetics context. To fully specify the task, we construct a labeled cosmetics corpus as a public benchmark and propose a product embedding model in which each product corresponds to a dense representation that jointly encodes information about the product and its context. In addition, to increase the training data, we propose a transfer learning framework in which distant supervision with heuristic patterns is applied first, followed by supervised learning on a small amount of manually labeled data. The experimental results show that our model provides a strong benchmark performance on this task.
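The distant-supervision step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which heuristic patterns are used, so the rule below is an assumption made purely for illustration: a headword is labeled with a product only when exactly one knowledge-base product of that type has its brand mentioned in the same article. The `Product` record, the `distant_labels` function, and the toy knowledge base are hypothetical names, not part of the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """Hypothetical knowledge-base entry (not the paper's schema)."""
    product_id: str
    brand: str
    headword: str  # product type, e.g. "mask" or "stick"


def distant_labels(tokens, products):
    """Assumed heuristic: link a headword token to a product only if exactly
    one product with that headword has its brand present in the article.

    Returns a list of (token_index, product_id) pairs.
    """
    lowered = [t.lower() for t in tokens]
    present = set(lowered)
    labels = []
    for i, tok in enumerate(lowered):
        candidates = [
            p for p in products
            if p.headword == tok and p.brand.lower() in present
        ]
        if len(candidates) == 1:  # skip ambiguous or unmatched headwords
            labels.append((i, candidates[0].product_id))
    return labels


# Toy knowledge base with invented brands.
KB = [
    Product("P1", "acme", "mask"),
    Product("P2", "bolt", "mask"),
    Product("P3", "acme", "stick"),
]

tokens = "I tried the Acme mask and the stick last week".split()
labels = distant_labels(tokens, KB)
print(labels)  # [(4, 'P1'), (7, 'P3')]
```

Noisy labels produced this way would serve only as a first training stage; in the framework the abstract describes, a model trained on them is then refined with supervised learning on the small manually labeled set.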
Pages: 1910-1917
Number of pages: 8