Entity Resolution for Big Data

Authors
Getoor, Lise [1]
Machanavajjhala, Ashwin [2]
Affiliations
[1] Univ Maryland, Comp Sci Dept, College Pk, MD 20742 USA
[2] Duke Univ, Dept Comp Sci, Durham, NC 27706 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Entity resolution (ER), the problem of extracting, matching and resolving entity mentions in structured and unstructured data, is a long-standing challenge in database management, information retrieval, machine learning, natural language processing and statistics. Accurate and fast entity resolution has huge practical implications in a wide variety of commercial, scientific and security domains. Despite the long history of work on entity resolution, there is still a surprising diversity of approaches and a lack of guiding theory. Meanwhile, in the age of big data, the need for high-quality entity resolution is growing, as we are inundated with more and more data, all of which needs to be integrated, aligned and matched before further utility can be extracted. In this tutorial, we bring together perspectives on entity resolution from a variety of fields, including databases, information retrieval, natural language processing and machine learning, to provide, in one setting, a survey of a large body of work. We discuss both the practical aspects and theoretical underpinnings of ER. We describe existing solutions, current challenges and open research problems. In addition to giving attendees a thorough understanding of existing ER models, algorithms and evaluation methods, the tutorial will cover important research topics such as scalable ER, active and lightly supervised ER, and query-driven ER.
Pages: 1525-1525
Page count: 1