Distribution-free data density estimation in large-scale networks

Authors
Minqi Zhou
Rong Zhang
Weining Qian
Aoying Zhou
Affiliations
[1] East China Normal University, Data Science and Engineering Institute
[2] Wuhan University, State Key Lab of Software Engineering
Keywords
distribution-free; data density estimation; random sampling
DOI
Not available
Abstract
Estimating the global data distribution in large-scale networks is an important problem that has yet to be well addressed. It can benefit many applications, especially in the cloud computing era, such as load balancing analysis, query processing, and data mining. Inspired by the inversion method for random variate generation, this paper presents a novel model, distribution-free data density estimation, for large ring-based networks. The model achieves high estimation accuracy at low estimation cost regardless of the distribution of the underlying data. It generates random samples for any arbitrary distribution by sampling the global cumulative distribution function, and it is free from sampling bias. With this estimation method, data densities can be estimated over both one-dimensional and multidimensional tuple sets, where the domain of each dimension may be either continuous or discrete. In large-scale networks, the key idea of distribution-free estimation is to sample a small subset of peers and use it to estimate the global data distribution over the data domain. Algorithms for computing and sampling the global cumulative distribution function, from which the global data distribution is estimated, are introduced together with a detailed theoretical analysis. An extensive performance study confirms the effectiveness and efficiency of the methods in large ring-based networks.
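The inversion method mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it assumes, purely for illustration, that each peer on the ring owns a contiguous key range, that every peer's tuple count is already known, and that data is spread uniformly inside each range. The names `build_global_cdf`, `inverse_cdf_sample`, and `estimate_density` are hypothetical.

```python
import bisect
import random


def build_global_cdf(peer_counts):
    """Cumulative share of all tuples held by peers 0..i (assumed known)."""
    total = sum(peer_counts)
    cdf, running = [], 0
    for count in peer_counts:
        running += count
        cdf.append(running / total)
    return cdf


def inverse_cdf_sample(peer_ranges, cdf, rng):
    """Draw one value by inverting the piecewise-linear global CDF."""
    u = rng.random()                      # uniform in [0, 1)
    i = bisect.bisect_right(cdf, u)       # first peer whose cdf exceeds u
    prev = cdf[i - 1] if i > 0 else 0.0
    lo, hi = peer_ranges[i]
    # Assumption: tuples are uniform inside a peer's key range.
    frac = (u - prev) / (cdf[i] - prev)
    return lo + frac * (hi - lo)


def estimate_density(samples, lo, hi, bins):
    """Histogram density estimate over [lo, hi] from the drawn samples."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / (len(samples) * width) for c in counts]


# Hypothetical ring of four peers; the third peer holds no data.
peer_ranges = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]
peer_counts = [10, 70, 0, 20]

rng = random.Random(42)
cdf = build_global_cdf(peer_counts)
samples = [inverse_cdf_sample(peer_ranges, cdf, rng) for _ in range(20000)]
density = estimate_density(samples, 0.0, 4.0, bins=4)
print(density)
```

Because each sample is obtained by mapping a uniform random number through the inverse of the cumulative distribution function, peers are visited in proportion to the data they hold, which is the intuition behind the bias-free sampling the abstract describes. How the global cumulative distribution function is computed and sampled in a real ring-based network is the subject of the paper itself and is not reproduced here.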
Pages: 1220-1240
Page count: 20