The Five Generations of Entity Resolution on Web Data

Cited by: 0
Authors
Nikoletos, Konstantinos [1]
Ioannou, Ekaterini [2]
Papadakis, George [1]
Affiliations
[1] Univ Athens, Athens, Greece
[2] Tilburg Univ, Tilburg, Netherlands
Source
WEB ENGINEERING, ICWE 2024 | 2024 / Vol. 14629
Keywords
Entity Resolution; Data Integration; LLMs
DOI
10.1007/978-3-031-62362-2_46
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Entity Resolution is a core data integration task that has attracted a large body of work on improving its effectiveness and time efficiency. This tutorial provides a comprehensive overview of the field, dividing the relevant methods into five main generations. The first targets Veracity in the context of structured data with a clean schema. The second generation extends its focus to cover Volume as well, leveraging multi-core or massive parallelization to process large-scale datasets. The third generation addresses the additional challenge of Variety, targeting voluminous, noisy, semi-structured, and highly heterogeneous data from the Semantic Web. The fourth generation also tackles Velocity in order to process data collections of continuously increasing volume. The latest works belong to the fifth generation, which involves pre-trained (large) language models that rely heavily on external knowledge to address all four Vs with high effectiveness.
Pages: 469-473
Page count: 5