A Hybrid Data Cleaning Framework Using Markov Logic Networks (Extended Abstract)

Cited by: 1
Authors
Ge, Congcong [1]
Gao, Yunjun [1]
Miao, Xiaoye [2]
Yao, Bin [3]
Wang, Haobo [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou, Peoples R China
[2] Zhejiang Univ, Ctr Data Sci, Hangzhou, Peoples R China
[3] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
DOI
10.1109/ICDE51399.2021.00258
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the growth of dirty data, data cleaning has become a crux of data analysis. In this paper, we propose a novel hybrid data cleaning framework, termed MLNClean, which is capable of learning instantiated rules to supplement insufficient integrity constraints. MLNClean consists of two steps: pre-processing and two-stage data cleaning. In the pre-processing step, MLNClean first infers a set of probable instantiated rules using a Markov logic network (MLN) and then builds a two-layer MLN index to generate multiple data versions and facilitate the cleaning process. In the two-stage data cleaning step, MLNClean first uses a reliability score to clean errors within each data version separately, and then eliminates conflicting values among different data versions using a fusion score. Extensive experiments on both real and synthetic scenarios demonstrate the effectiveness of MLNClean.
Pages: 2344-2345
Number of pages: 2