Indexing uncertain categorical data

Authors
Singh, Sarvjeet [1]
Mayfield, Chris [1]
Prabhakar, Sunil [1]
Shah, Rahul [1]
Hambrusch, Susanne [1]
Affiliations
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Uncertainty in categorical data is commonplace in many applications, including data cleaning, database integration, and biological annotation. In such domains, the correct value of an attribute is often unknown, but may be selected from a reasonable number of alternatives. Current database management systems do not provide a convenient means for representing or manipulating this type of uncertainty. In this paper we extend traditional systems to explicitly handle uncertainty in data values. We propose two index structures for efficiently searching uncertain categorical data, one based on the R-tree and another based on an inverted index structure. Using these structures, we provide a detailed description of the probabilistic equality queries they support. Experimental results using real and synthetic datasets demonstrate how these index structures can effectively improve the performance of queries through the use of internal probabilistic information.
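As a rough illustration of the inverted-index idea mentioned in the abstract, the sketch below stores each uncertain categorical value as a mapping from category to probability and answers a threshold form of the probabilistic equality query. It assumes that the equality probability of two independent uncertain values is the sum over categories of the product of their probabilities. The names `equality_probability`, `InvertedIndex`, and `threshold_query` are hypothetical and are not taken from the paper, and the sketch omits the pruning strategies and the R-tree-based structure that the paper describes.

```python
from collections import defaultdict


def equality_probability(u, v):
    """Assumed definition: Pr(u = v) = sum over categories of u[c] * v[c]."""
    return sum(p * v.get(cat, 0.0) for cat, p in u.items())


class InvertedIndex:
    """Hypothetical inverted index: category -> list of (tuple_id, probability)."""

    def __init__(self):
        self.postings = defaultdict(list)

    def insert(self, tuple_id, distribution):
        # Record the tuple in the posting list of every category
        # to which it assigns non-zero probability.
        for cat, p in distribution.items():
            if p > 0:
                self.postings[cat].append((tuple_id, p))

    def threshold_query(self, query, tau):
        """Return ids of tuples whose equality probability with `query` is >= tau."""
        scores = defaultdict(float)
        # Only posting lists for categories present in the query are scanned.
        for cat, qp in query.items():
            for tuple_id, p in self.postings.get(cat, ()):
                scores[tuple_id] += qp * p
        return sorted(tid for tid, s in scores.items() if s >= tau)


index = InvertedIndex()
index.insert("t1", {"brake": 0.5, "tire": 0.5})
index.insert("t2", {"brake": 0.8, "trans": 0.2})
index.insert("t3", {"exhaust": 1.0})

matches = index.threshold_query({"brake": 1.0}, tau=0.6)
```

In this toy example only `t2` reaches the threshold, since `t1` matches the query with probability 0.5 and `t3` shares no category with it. Scanning only the posting lists of the query's categories is what lets such an index skip tuples that cannot match.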
Pages: 591 / +
Page count: 2