Cost Reduction for Web-Based Data Imputation

Cited by: 0
Authors
Li, Zhixu [1]
Shang, Shuo [2]
Xie, Qing [1]
Zhang, Xiangliang [1]
Affiliations
[1] King Abdullah Univ Sci & Technol, Jeddah, Saudi Arabia
[2] China Univ Petr, Dept Software Engn, Beijing, Peoples R China
Keywords
Web-based Data Imputation; Imputation Query; Cost Reduction
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Web-based Data Imputation completes incomplete data sets by retrieving absent field values from the Web. In particular, complete fields can be used as keywords in imputation queries for absent fields. However, due to the ambiguity of these keywords and the complexity of data on the Web, different queries may retrieve different answers for the same absent field value. To decide the most probable correct answer for each absent field value, existing methods issue many of the available imputation queries for each absent value and then vote to decide the most probable correct answer. As a result, a large number of imputation queries must be issued to fill all absent values in an incomplete data set, which incurs a large overhead. In this paper, we reduce the cost of Web-based Data Imputation in two ways. First, we propose a query execution scheme that secures the most probable correct answer to an absent field value while issuing as few imputation queries as possible. Second, we identify and prune, a priori, queries that will probably fail to return any answer. Our extensive experimental evaluation shows that the proposed techniques substantially reduce the cost of Web-based Data Imputation without hurting its high imputation accuracy.
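The idea of securing the winning answer with as few queries as possible can be illustrated with a minimal sketch. It is not the scheme from the paper, whose details the abstract does not give. It assumes a plain unweighted majority vote, and the names `impute_with_early_stop` and `run_query` are hypothetical. Queries are issued one at a time, and the loop stops as soon as the leading answer can no longer be overtaken or tied by the queries that remain.

```python
from collections import Counter
from typing import Callable, Optional, Sequence, Tuple


def impute_with_early_stop(
    queries: Sequence[str],
    run_query: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], int]:
    """Return (winning answer or None, number of queries issued).

    Assumption: each query casts one equal vote for the answer it
    retrieves, and a query that finds nothing returns None.
    """
    votes: Counter = Counter()
    issued = 0
    for query in queries:
        answer = run_query(query)  # hypothetical Web lookup
        issued += 1
        if answer is not None:
            votes[answer] += 1

        remaining = len(queries) - issued
        ranked = votes.most_common(2)
        if ranked:
            lead = ranked[0][1]
            runner_up = ranked[1][1] if len(ranked) > 1 else 0
            # Even if every remaining query voted for the same rival
            # (seen or unseen), it could not reach the leader's count.
            if lead - runner_up > remaining:
                break

    winner = votes.most_common(1)[0][0] if votes else None
    return winner, issued
```

With five candidate queries of which the first three all return the same value, the loop stops after the third query, since the two remaining votes cannot change the outcome. A weighted vote, or an ordering of queries by expected reliability, would need a different stopping bound.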
Pages: 438-452
Page count: 15