Improving Data Provenance Reconstruction via a Multi-Level Funneling Approach

Authors
Vasudevan, Subha [1]
Pfeffer, William [1]
Davis, Delmar [1]
Asuncion, Hazeline [1]
Affiliations
[1] Univ Washington, Sch Sci Technol Engn & Math, Bothell, WA 98011 USA
Funding
U.S. National Science Foundation
Keywords
data provenance; provenance reconstruction; Latent Dirichlet Allocation; Genetic Algorithm; Longest Common Subsequence; Statistical Re-clustering; Silhouette Coefficient
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
The ease with which data can be created, copied, modified, and deleted over the Internet has made it increasingly difficult to determine the source of web data. Data provenance, which provides information about the origin and lineage of a dataset, assists in determining its genuineness and trustworthiness. Several data provenance techniques record provenance when the data is created or modified. However, many existing datasets have no recorded provenance. Provenance reconstruction techniques attempt to generate an approximate provenance for these datasets. Current reconstruction techniques require timing metadata to reconstruct provenance. In this paper, we improve our multi-funneling technique, which combines existing techniques, including topic modeling, longest common subsequence, and a genetic algorithm, to achieve higher accuracy in reconstructing provenance without requiring timing metadata. In addition, we introduce novel funnels that are customized to the provided datasets, which further boost precision and recall rates. We evaluate our approach with various experiments and compare its results with those of existing techniques. Finally, we present lessons learned, including the applicability of our approach to other datasets.
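As a rough illustration of one of the building blocks named in the abstract, the sketch below computes a longest-common-subsequence (LCS) similarity between two text documents. It is not the authors' implementation: the function names `lcs_length` and `lcs_similarity`, the whitespace tokenization, and the normalization by the longer document's length are all assumptions made for this example.

```python
def lcs_length(a, b):
    """Return the length of the longest common subsequence of two sequences."""
    # prev[j] holds the LCS length of the items of `a` seen so far and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(doc_a, doc_b):
    """Hypothetical similarity score in [0, 1] based on word-level LCS.

    Assumption: documents are split on whitespace and the LCS length is
    divided by the length of the longer document.
    """
    a, b = doc_a.split(), doc_b.split()
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))
```

A score near 1 would suggest that one document is a lightly edited copy of the other, which is the kind of signal a reconstruction funnel could use to propose a derivation link. How such scores are thresholded or combined with topic modeling and the genetic algorithm is not specified in this record.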
Pages: 175-184
Page count: 10