Random sampling for continuous streams with arbitrary updates

Cited by: 10
Authors
Tao, Yufei [1]
Lian, Xiang
Papadias, Dimitris
Hadjieleftheriou, Marios
Affiliations
[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Sha Tin, Hong Kong, Peoples R China
[2] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Clear Water Bay, Hong Kong, Peoples R China
[3] AT&T Labs, Florham Pk, NJ 07932 USA
Keywords
sampling; selectivity estimation
DOI
10.1109/TKDE.2007.250588
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The existing random sampling methods have at least one of the following disadvantages: they 1) are applicable only to certain update patterns, 2) entail large space overhead, or 3) incur prohibitive maintenance cost. These drawbacks prevent their effective application in stream environments (where a relation is updated by a large volume of insertions and deletions that may arrive in any order), despite the considerable success of random sampling in conventional databases. Motivated by this, we develop several fully dynamic algorithms for obtaining random samples from individual relations, and from the join result of two tables. Our solutions can handle any update pattern with small space and computational overhead. We also present an in-depth analysis that provides valuable insight into the characteristics of alternative sampling strategies and leads to precision guarantees. Extensive experiments validate our theoretical findings and demonstrate the efficiency of our techniques in practice.
Pages: 96-110
Page count: 15