Active Blocking Scheme Learning for Entity Resolution

Cited by: 3
Authors
Shao, Jingyu [1]
Wang, Qing [1]
Affiliations
[1] Australian Natl Univ, Res Sch Comp Sci, Canberra, ACT, Australia
Funding
Australian Research Council;
Keywords
Entity resolution; Blocking scheme; Active learning; Record linkage;
DOI
10.1007/978-3-319-93037-4_28
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Blocking is an important part of entity resolution. It aims to improve time efficiency by grouping potentially matched records into the same block. In the past, both supervised and unsupervised approaches have been proposed. Nonetheless, existing approaches have limitations: either a large number of labels is required, or blocking quality is hard to guarantee. To address these issues, we propose a blocking scheme learning approach based on active learning techniques. With a limited label budget, our approach can learn a blocking scheme that generates high-quality blocks. Two strategies, called active sampling and active branching, are proposed to select samples and generate blocking schemes efficiently. We experimentally verify that our approach outperforms several baseline approaches over four real-world datasets.
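To make the notion of blocking in the abstract concrete, the following is a minimal Python sketch of how a blocking scheme cuts down the record pairs that need to be compared. It is an illustration only, not the authors' method: it does not implement active sampling or active branching, and the toy records, the `surname_key` function, and the `candidate_pairs` helper are all hypothetical names chosen for this example. The scheme is assumed to be a disjunction of key functions, so two records become a candidate pair if any key places them in the same block.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: (id, name, city). Not taken from the paper.
records = [
    (1, "john smith", "sydney"),
    (2, "jon smith", "sydney"),
    (3, "mary jones", "perth"),
    (4, "mary jones", "perth"),
    (5, "alan wong", "perth"),
]


def surname_key(record):
    """Illustrative blocking key: the last token of the name field."""
    return record[1].split()[-1]


def candidate_pairs(records, key_functions):
    """Return id pairs that share a block under at least one key function."""
    pairs = set()
    for key_fn in key_functions:
        # Group record ids by the value of this blocking key.
        blocks = defaultdict(list)
        for rec in records:
            blocks[key_fn(rec)].append(rec[0])
        # Only records inside the same block are compared with each other.
        for ids in blocks.values():
            pairs.update(combinations(sorted(ids), 2))
    return pairs


pairs = candidate_pairs(records, [surname_key])
total_pairs = len(list(combinations(records, 2)))
print(f"{len(pairs)} candidate pairs instead of {total_pairs}")
```

In this toy setting the single surname key leaves two candidate pairs out of ten possible ones. Choosing which keys to combine so that true matches stay together is the part the paper addresses with learning.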
Pages: 350 - 362
Number of pages: 13
Related papers
50 in total
  • [31] ERABQS: entity resolution based on active machine learning and balancing query strategy
    Mourad, Jabrane
    Hiba, Tabbaa
    Yassir, Rochd
    Imad, Hafidi
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2024, 62 (05) : 1347 - 1373
  • [32] Graph-Boosted Active Learning for Multi-source Entity Resolution
    Primpeli, Anna
    Bizer, Christian
    SEMANTIC WEB - ISWC 2021, 2021, 12922 : 182 - 199
  • [33] Enhancing Entity Resolution with a hybrid Active Machine Learning framework: Strategies for optimal learning in sparse datasets
    Jabrane, Mourad
    Tabbaa, Hiba
    Hadri, Aissam
    Hafidi, Imad
    INFORMATION SYSTEMS, 2024, 125
  • [34] Meta-Blocking: Taking Entity Resolution to the Next Level
    Papadakis, George
    Koutrika, Georgia
    Palpanas, Themis
    Nejdl, Wolfgang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (08) : 1946 - 1960
  • [35] Web-scale Blocking, Iterative and Progressive Entity Resolution
    Stefanidis, Kostas
    Christophides, Vassilis
    Efthymiou, Vasilis
    2017 IEEE 33RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2017), 2017 : 1459 - 1462
  • [36] Incremental Blocking for Entity Resolution over Web Streaming Data
    Araujo, Tiago Brasileiro
    Stefanidis, Kostas
    Santos Pires, Carlos Eduardo
    Nummenmaa, Jyrki
    da Nobrega, Thiago Pereira
    2019 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE (WI 2019), 2019 : 332 - 336
  • [37] A Blocking Framework for Entity Resolution in Highly Heterogeneous Information Spaces
    Papadakis, George
    Ioannou, Ekaterini
    Palpanas, Themis
    Niederee, Claudia
    Nejdl, Wolfgang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2013, 25 (12) : 2665 - 2682
  • [38] Gradual Machine Learning for Entity Resolution
    Hou, Boyi
    Chen, Qun
    Wang, Yanyan
    Nafa, Youcef
    Li, Zhanhuai
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (04) : 1803 - 1814
  • [39] Learning Distance Metrics for Entity Resolution
    Li, Lingli
    Shang, Xiaodan
    Li, Jinbao
    Hu, Jin
    IEEE ACCESS, 2018, 6 : 54900 - 54909
  • [40] Gradual Machine Learning for Entity Resolution
    Hou, Boyi
    Chen, Qun
    Shen, Jiquan
    Liu, Xin
    Zhong, Ping
    Wang, Yanyan
    Chen, Zhaoqiang
    Li, Zhanhuai
    WEB CONFERENCE 2019: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2019), 2019 : 3526 - 3530