Joint Entity Resolution

Citations: 11
Authors
Whang, Steven Euijong [1]
Garcia-Molina, Hector [1]
Affiliation
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
DOI
10.1109/ICDE.2012.119
Chinese Library Classification (CLC)
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Entity resolution (ER) is the problem of identifying which records in a database represent the same entity. Often, records of different types are involved (e.g., authors, publications, institutions, venues), and resolving records of one type can impact the resolution of other types of records. In this paper we propose a flexible, modular resolution framework where existing ER algorithms developed for a given record type can be plugged in and used in concert with other ER algorithms. Our approach also makes it possible to run ER on subsets of similar records at a time, important when the full data is too large to resolve together. We study the scheduling and coordination of the individual ER algorithms in order to resolve the full data set. We then evaluate our joint ER techniques on synthetic and real data and show the scalability of our approach.
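
The abstract's idea of plugging per-type ER algorithms into one framework and coordinating them can be illustrated with a small sketch. This is not the paper's algorithm: the toy records, the matching rules, and the names `cluster`, `resolve_papers`, `resolve_authors`, and `joint_resolve` are all assumptions made for illustration. Each resolver handles one record type, and the scheduler reruns the resolvers until no partition changes, so that merging papers can trigger further merging of authors.

```python
from itertools import combinations

# Toy data (hypothetical, not from the paper): two record types that reference each other.
PAPERS = {
    "p1": {"title": "Graph Matching", "authors": ["a1"]},
    "p2": {"title": "graph matching", "authors": ["a2"]},
    "p3": {"title": "Query Planning", "authors": ["a3"]},
}
AUTHORS = {
    "a1": {"name": "J. Smith"},
    "a2": {"name": "J. Smith"},
    "a3": {"name": "J. Smith"},
}


def cluster(ids, match):
    """Group ids by the transitive closure of a pairwise match function.

    Returns a dict mapping each id to the smallest id in its cluster.
    """
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(sorted(ids), 2):
        if match(a, b):
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)
    return {i: find(i) for i in ids}


def resolve_papers(state):
    """Assumed rule: two papers match if their titles are equal ignoring case."""
    return cluster(
        PAPERS,
        lambda a, b: PAPERS[a]["title"].lower() == PAPERS[b]["title"].lower(),
    )


def resolve_authors(state):
    """Assumed rule: same name AND at least one paper in the same paper cluster."""
    paper_cluster = state["paper"]
    wrote = {a: set() for a in AUTHORS}
    for pid, paper in PAPERS.items():
        for a in paper["authors"]:
            wrote[a].add(paper_cluster[pid])
    return cluster(
        AUTHORS,
        lambda a, b: AUTHORS[a]["name"] == AUTHORS[b]["name"]
        and bool(wrote[a] & wrote[b]),
    )


def joint_resolve(schedule, state):
    """Run the resolvers in schedule order until no partition changes."""
    changed = True
    while changed:
        changed = False
        for record_type, resolver in schedule:
            new_partition = resolver(state)
            if new_partition != state[record_type]:
                state[record_type] = new_partition
                changed = True
    return state


# Start with every record in its own cluster.
initial = {
    "author": {a: a for a in AUTHORS},
    "paper": {p: p for p in PAPERS},
}
# Authors are scheduled first on purpose: the first pass cannot merge them,
# and only after the papers are merged does a second pass merge a1 and a2.
result = joint_resolve([("author", resolve_authors), ("paper", resolve_papers)], initial)
```

In this toy run the authors `a1` and `a2` end up in one cluster only because their papers `p1` and `p2` were merged first, while `a3` stays separate despite having the same name. A real scheduler would also need to decide the order of resolvers and how to process subsets of the data, which this sketch does not attempt.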
Pages: 294-305
Page count: 12