Improving Data Provenance Reconstruction via a Multi-Level Funneling Approach

Cited by: 0
Authors
Vasudevan, Subha [1]
Pfeffer, William [1]
Davis, Delmar [1]
Asuncion, Hazeline [1]
Affiliations
[1] Univ Washington, Sch Sci Technol Engn & Math, Bothell, WA 98011 USA
Funding
U.S. National Science Foundation
Keywords
data provenance; provenance reconstruction; Latent Dirichlet Allocation; Genetic Algorithm; Longest Common Subsequence; Statistical Re-clustering; Silhouette Coefficient
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The ease with which data can be created, copied, modified, and deleted over the Internet has made it increasingly difficult to determine the source of web data. Data provenance, which provides information about the origin and lineage of a dataset, assists in determining its genuineness and trustworthiness. Several data provenance techniques record provenance when the data is created or modified. However, many existing datasets have no recorded provenance. Provenance reconstruction techniques attempt to generate an approximate provenance for these datasets. Current reconstruction techniques require timing metadata to reconstruct provenance. In this paper, we improve our multi-funneling technique, which combines existing techniques, including topic modeling, longest common subsequence, and a genetic algorithm, to achieve higher accuracy in reconstructing provenance without requiring timing metadata. In addition, we introduce novel funnels customized to the provided datasets, which further boost precision and recall rates. We evaluate our approach with various experiments and compare its results with those of existing techniques. Finally, we present lessons learned, including the applicability of our approach to other datasets.
Pages: 175-184
Page count: 10