The Five Generations of Entity Resolution on Web Data

Cited by: 0
Authors
Nikoletos, Konstantinos [1]
Ioannou, Ekaterini [2]
Papadakis, George [1]
Affiliations
[1] Univ Athens, Athens, Greece
[2] Tilburg Univ, Tilburg, Netherlands
Source
WEB ENGINEERING, ICWE 2024 | 2024 / Vol. 14629
Keywords
Entity Resolution; Data Integration; LLMs
DOI
10.1007/978-3-031-62362-2_46
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Entity Resolution is a core data integration task that has attracted a large body of work on improving its effectiveness and time efficiency. This tutorial provides a comprehensive overview of the field, dividing the relevant methods into five main generations. The first targets Veracity in the context of structured data with a clean schema. The second generation extends its focus to cover Volume as well, leveraging multi-core or massive parallelization to process large-scale datasets. The third generation addresses the additional challenge of Variety, targeting voluminous, noisy, semi-structured, and highly heterogeneous data from the Semantic Web. The fourth generation also tackles Velocity, so as to process data collections of continuously increasing volume. The latest works, though, belong to the fifth generation, which involves pre-trained (large) language models that rely heavily on external knowledge to address all four Vs with high effectiveness.
Pages: 469-473
Page count: 5