A Confidence-based Entity Resolution Approach with Incomplete Information

Cited by: 0
Authors
Gu, Qi [1, 2]
Zhang, Yan [1]
Cao, Jian [1]
Xu, Guandong [3]
Cuzzocrea, Alfredo [4, 5]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
[2] Nantong Univ, Sch Comp Sci & Technol, Nantong, Peoples R China
[3] Univ Technol Sydney, Sydney, NSW, Australia
[4] ICAR CNR, Cosenza, Italy
[5] Univ Calabria, Cosenza, Italy
Keywords
Entity Resolution; Cover Rate; Confidence; Accuracy;
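DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract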
Entity resolution identifies entities from different data sources that refer to the same real-world entity, and it is an important prerequisite for integrating data from multiple sources. It relies mainly on similarity measures computed over data records. In practice, however, the quality of data sources is often poor. Web data sources in particular frequently provide only incomplete information, which makes it difficult to apply similarity measures directly to identify matching entities. To address this problem, the concept of confidence is introduced to measure the trustworthiness of a similarity calculation. An adaptive rule-based approach is used to calculate the similarity between records and to derive the confidence of that similarity. The similarity and confidence are then propagated over the entity relational graph until a fixed point is reached. Finally, every pair of records is classified as matched or unmatched based on a threshold. We performed a series of experiments on real data sets, and the experimental results show that our approach performs better than the other approaches compared.
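The pipeline described in the abstract (similarity with confidence, propagation to a fixed point, thresholding) can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration and not the authors' method: similarity is taken as weighted exact-match agreement over the attributes present in both records, confidence as the weighted share of attributes that could be compared, and the update rule blends a pair's own similarity with the confidence-weighted similarity of neighbouring pairs. The names `pair_similarity`, `propagate`, and `matched_pairs` are hypothetical, and the sketch holds confidence fixed during propagation, whereas the abstract states that confidence is propagated as well.

```python
def pair_similarity(a, b, weights):
    """Return (similarity, confidence) for two records that may have missing fields.

    Assumption: a missing field is represented by None or by an absent key.
    """
    total = sum(weights.values())
    # Only attributes present in both records can be compared.
    shared = [k for k in weights if a.get(k) is not None and b.get(k) is not None]
    covered = sum(weights[k] for k in shared)
    if covered == 0:
        return 0.0, 0.0  # nothing comparable, so no evidence either way
    agree = sum(weights[k] for k in shared if a[k] == b[k])
    return agree / covered, covered / total


def propagate(scores, neighbors, tol=1e-6, max_iter=100):
    """Iterate similarity over a graph of record pairs until updates fall below tol.

    scores:    {pair: (initial_similarity, confidence)}
    neighbors: {pair: [related pairs]} (assumed form of the entity relational graph)
    """
    sim = {pair: s for pair, (s, _) in scores.items()}
    for _ in range(max_iter):
        new = {}
        for pair, (s0, c0) in scores.items():
            nbrs = neighbors.get(pair, [])
            wsum = sum(scores[q][1] for q in nbrs)
            if wsum == 0:
                new[pair] = s0  # no informative neighbours: keep own similarity
            else:
                nb = sum(scores[q][1] * sim[q] for q in nbrs) / wsum
                # Low-confidence pairs lean more on their neighbours.
                new[pair] = c0 * s0 + (1 - c0) * nb
        delta = max((abs(new[p] - sim[p]) for p in sim), default=0.0)
        sim = new
        if delta < tol:
            break
    return sim


def matched_pairs(sim, threshold):
    """Classify pairs as matched when their final similarity reaches the threshold."""
    return [pair for pair, s in sim.items() if s >= threshold]


# Toy usage with made-up records: only "name" is present in both.
weights = {"name": 1, "email": 1, "city": 1}
r1 = {"name": "J. Smith", "email": "js@example.org", "city": None}
r2 = {"name": "J. Smith", "email": None, "city": "Leeds"}
similarity, confidence = pair_similarity(r1, r2, weights)
```

In this toy case the two records agree on the only attribute they share, so the similarity is high while the confidence is low, which is the situation the confidence measure is meant to flag.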
Pages: 97 - 103
Page count: 7
Related papers
50 in total
  • [41] Confidence-based reasoning in stochastic constraint programming
    Rossi, Roberto
    Hnich, Brahim
    Tarim, S. Armagan
    Prestwich, Steven
    ARTIFICIAL INTELLIGENCE, 2015, 228 : 129 - 152
  • [42] Confidence Level Estimation and Design Sensitivity Analysis for Confidence-based RBDO
    Cho, Hyunkyoo
    Choi, K. K.
    Lee, Ikjin
    Gorsich, David
    PROCEEDINGS OF THE ASME INTERNATIONAL DESIGN ENGINEERING TECHNICAL CONFERENCES AND COMPUTERS AND INFORMATION IN ENGINEERING CONFERENCE 2012, VOL 3, PTS A AND B, 2012, : 1227 - +
  • [43] Classifying with confidence from incomplete information
    Applied Physics Laboratory, Johns Hopkins University, Laurel, MD 20723, United States
    Unknown
    Unknown
    Unknown
    J. Mach. Learn. Res., 2013 : 3561 - 3589
  • [44] Classifying With Confidence From Incomplete Information
    Parrish, Nathan
    Anderson, Hyrum S.
    Gupta, Maya R.
    Hsiao, Dun Yu
    JOURNAL OF MACHINE LEARNING RESEARCH, 2013, 14 : 3561 - 3589
  • [45] Distantly Supervised Named Entity Recognition via Confidence-Based Multi-Class Positive and Unlabeled Learning
    Zhou, Kang
    Li, Yuepei
    Li, Qi
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 7198 - 7211
  • [46] Reliability measure approach for confidence-based design optimization under insufficient input data
    Yongsu Jung
    Hyunkyoo Cho
    Ikjin Lee
    Structural and Multidisciplinary Optimization, 2019, 60 : 1967 - 1982
  • [47] Reliability measure approach for confidence-based design optimization under insufficient input data
    Jung, Yongsu
    Cho, Hyunkyoo
    Lee, Ikjin
    STRUCTURAL AND MULTIDISCIPLINARY OPTIMIZATION, 2019, 60 (05) : 1967 - 1982
  • [48] Targeting uncertainty in smart CPS by confidence-based logic
    Bures, Tomas
    Hnetynka, Petr
    Plasil, Frantisek
    Skoda, Dominik
    Kofron, Jan
    Al Ali, Rima
    Gerostathopoulos, Ilias
    JOURNAL OF SYSTEMS AND SOFTWARE, 2021, 181
  • [49] Confidence-Based Work Stealing in Parallel Constraint Programming
    Chu, Geoffrey
    Schulte, Christian
    Stuckey, Peter J.
    PRINCIPLES AND PRACTICE OF CONSTRAINT PROGRAMMING, 2009, 5732 : 226 - +
  • [50] Confidence-Based Algorithm Parameter Tuning with Dynamic Resampling
    da Cruz, Andre Rodrigues
    Caldeira Takahashi, Ricardo Hiroshi
    OPTIMIZATION, LEARNING ALGORITHMS AND APPLICATIONS, OL2A 2022, 2022, 1754 : 309 - 326