A Hybrid Data Cleaning Framework Using Markov Logic Networks (Extended Abstract)

Cited by: 1
Authors
Ge, Congcong [1]
Gao, Yunjun [1]
Miao, Xiaoye [2]
Yao, Bin [3]
Wang, Haobo [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou, Peoples R China
[2] Zhejiang Univ, Ctr Data Sci, Hangzhou, Peoples R China
[3] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
DOI
10.1109/ICDE51399.2021.00258
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With the growth of dirty data, data cleaning has become a crux of data analysis. In this paper, we propose a novel hybrid data cleaning framework, termed MLNClean, which is capable of learning instantiated rules to supplement insufficient integrity constraints. MLNClean consists of two steps: pre-processing and two-stage data cleaning. In the pre-processing step, MLNClean first infers a set of probable instantiated rules according to a Markov logic network (MLN) and then builds a two-layer MLN index to generate multiple data versions and facilitate the cleaning process. In the two-stage data cleaning step, MLNClean first uses a reliability score to clean errors within each data version separately, and then eliminates conflicting values among different data versions using a novel fusion score. Extensive experiments on both real and synthetic scenarios demonstrate the effectiveness of MLNClean.
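The two-step flow described in the abstract can be illustrated with a toy sketch. This is not the MLNClean implementation: the abstract does not define the reliability score, the fusion score, or the index layout, so everything below is an assumption made for illustration. The hypothetical rules are written as (determinant attributes, dependent attribute) pairs, the "two-layer index" is taken to be one block per rule with groups of rows sharing determinant values, a group-majority vote stands in for the reliability score, and the highest relative support stands in for the fusion score.

```python
from collections import Counter, defaultdict

# Hypothetical rules (not from the paper): determinant attributes -> dependent attribute.
RULES = [(("zip",), "city"), (("area",), "city")]

def build_index(rows, rules):
    """Layer 1: one block per rule. Layer 2: row ids grouped by determinant values."""
    index = {}
    for lhs, rhs in rules:
        groups = defaultdict(list)
        for i, row in enumerate(rows):
            groups[tuple(row[a] for a in lhs)].append(i)
        index[(lhs, rhs)] = dict(groups)
    return index

def clean_versions(rows, index):
    """Stage 1 stand-in: within each block, propose the group's majority value.

    Returns {(row id, attribute): [(value, support), ...]}, one entry per block,
    where support is the fraction of the group that holds the majority value.
    """
    proposals = defaultdict(list)
    for (_lhs, rhs), groups in index.items():
        for ids in groups.values():
            value, n = Counter(rows[i][rhs] for i in ids).most_common(1)[0]
            for i in ids:
                proposals[(i, rhs)].append((value, n / len(ids)))
    return proposals

def fuse(rows, proposals):
    """Stage 2 stand-in: resolve conflicts across blocks by highest support."""
    cleaned = [dict(r) for r in rows]
    for (i, attr), candidates in proposals.items():
        cleaned[i][attr] = max(candidates, key=lambda c: c[1])[0]
    return cleaned

rows = [
    {"zip": "10001", "area": "212", "city": "New York"},
    {"zip": "10001", "area": "212", "city": "New York"},
    {"zip": "10001", "area": "212", "city": "Nwe York"},  # typo to be repaired
    {"zip": "60601", "area": "312", "city": "Chicago"},
    {"zip": "60601", "area": "312", "city": "Chicago"},
    {"zip": "60601", "area": "212", "city": "Chicago"},   # the two rules disagree here
]

index = build_index(rows, RULES)
cleaned = fuse(rows, clean_versions(rows, index))
```

In this toy data, the last row receives conflicting proposals from the two blocks, and the fusion stand-in keeps the value backed by the more consistent group. The actual framework derives its scores from the learned MLN rather than from simple vote fractions.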
Pages: 2344-2345
Number of pages: 2