Active Blocking Scheme Learning for Entity Resolution

Cited by: 3
Authors
Shao, Jingyu [1]
Wang, Qing [1]
Affiliations
[1] Australian National University, Research School of Computer Science, Canberra, ACT, Australia
Funding
Australian Research Council
Keywords
Entity resolution; Blocking scheme; Active learning; Record linkage
DOI
10.1007/978-3-319-93037-4_28
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Blocking is an important step in entity resolution. It improves time efficiency by grouping potentially matching records into the same block, so that only records within a block need to be compared. Both supervised and unsupervised approaches have been proposed. However, existing approaches have limitations: they either require a large number of labels or cannot easily guarantee blocking quality. To address these issues, we propose a blocking scheme learning approach based on active learning techniques. With a limited label budget, our approach can learn a blocking scheme that generates high-quality blocks. Two strategies, called active sampling and active branching, are proposed to select samples and generate blocking schemes efficiently. We experimentally verify that our approach outperforms several baseline approaches on four real-world datasets.
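To make the idea of blocking concrete, here is a minimal Python sketch of grouping records by a blocking key and comparing only records that share a block. It is not the authors' method: the toy records, the field names, and the key function `blocking_key` are all assumptions chosen for illustration, and the key is fixed by hand rather than learned.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records; the fields are illustrative only.
records = [
    {"id": 1, "surname": "Smith", "city": "Canberra"},
    {"id": 2, "surname": "Smyth", "city": "Canberra"},
    {"id": 3, "surname": "Jones", "city": "Sydney"},
    {"id": 4, "surname": "Jones", "city": "Sydney"},
    {"id": 5, "surname": "Brown", "city": "Perth"},
]


def blocking_key(record):
    # Assumed hand-written key: first letter of the surname plus the city.
    return (record["surname"][0].upper(), record["city"].lower())


def build_blocks(records, key_fn):
    # Group record ids by their blocking key.
    blocks = defaultdict(list)
    for record in records:
        blocks[key_fn(record)].append(record["id"])
    return dict(blocks)


def candidate_pairs(blocks):
    # Only records that fall into the same block are paired for comparison.
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


blocks = build_blocks(records, blocking_key)
pairs = candidate_pairs(blocks)

# Number of comparisons without blocking: every record against every other.
all_pairs = len(records) * (len(records) - 1) // 2

print(f"{len(pairs)} candidate pairs instead of {all_pairs}")
```

In this sketch the quality of the blocks depends entirely on the chosen key. A learned blocking scheme, as described in the abstract, would replace the hand-written `blocking_key` with one selected using labeled samples.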
Pages: 350-362
Number of pages: 13