Distribution-free data density estimation in large-scale networks

Authors
Minqi Zhou
Rong Zhang
Weining Qian
Aoying Zhou
Affiliations
[1] East China Normal University, Data Science and Engineering Institute
[2] Wuhan University, State Key Lab of Software Engineering
Keywords
distribution-free; data density estimation; random sampling
Abstract
Estimating the global data distribution in large-scale networks is an important problem that has yet to be well addressed. It can benefit many applications, especially in the cloud computing era, such as load balancing analysis, query processing, and data mining. Inspired by the inversion method for random variate (number) generation, in this paper we present a novel model, called distribution-free data density estimation, for large ring-based networks. It achieves high estimation accuracy at low estimation cost regardless of the distribution model of the underlying data. The model generates random samples for an arbitrary distribution by sampling the global cumulative distribution function, and it is free from sampling bias. With this estimation method, we can estimate data densities over both one-dimensional and multidimensional tuple sets, where the domain of each dimension may be either continuous or discrete. In large-scale networks, the key idea of distribution-free estimation is to sample a small subset of peers in order to estimate the global data distribution over the data domain. Algorithms for computing and sampling the global cumulative distribution function, on which the estimate of the global data distribution is based, are introduced with a detailed theoretical analysis. Our extensive performance study confirms the effectiveness and efficiency of our methods in large ring-based networks.
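The abstract names the inversion method applied to a global cumulative distribution function as the inspiration for the model. The following is a minimal illustrative sketch of that general idea only, not the algorithm from the paper: it assumes each peer on the ring holds a list of values, that per-peer item counts can be accumulated in ring order, and that a uniformly drawn global rank can be mapped back to a peer and a local item. The names `build_cumulative_counts`, `sample_by_inversion`, and `estimate_density` are hypothetical.

```python
import bisect
import random
from itertools import accumulate


def build_cumulative_counts(peers):
    """Running totals of item counts, one entry per peer in ring order.

    Assumption: every peer can report how many items it stores.
    """
    return list(accumulate(len(p) for p in peers))


def sample_by_inversion(peers, cumulative, rng):
    """Draw one item uniformly from the global data set.

    A uniform global rank plays the role of u in x = F^{-1}(u):
    the cumulative counts are inverted to find the owning peer.
    """
    total = cumulative[-1]
    rank = rng.randrange(total)  # uniform rank in [0, total)
    # First peer whose running total exceeds the rank (skips empty peers).
    peer_idx = bisect.bisect_right(cumulative, rank)
    before = cumulative[peer_idx - 1] if peer_idx else 0
    return peers[peer_idx][rank - before]


def estimate_density(samples, lo, hi, bins):
    """Equi-width histogram over [lo, hi]; returns the fraction per bin."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        idx = min(int((x - lo) / width), bins - 1)  # clamp x == hi
        counts[idx] += 1
    return [c / len(samples) for c in counts]


if __name__ == "__main__":
    # Hypothetical toy ring: four peers, one of them empty.
    peers = [[0.1, 0.2, 0.3], [], [0.5, 0.6], [0.7, 0.8, 0.9, 0.95]]
    cumulative = build_cumulative_counts(peers)
    rng = random.Random(0)
    samples = [sample_by_inversion(peers, cumulative, rng) for _ in range(1000)]
    print(estimate_density(samples, 0.0, 1.0, 4))
```

Because every global rank is equally likely, each stored item is drawn with the same probability regardless of how unevenly the items are spread across peers, which is the sense in which such a sampler avoids bias toward heavily or lightly loaded peers. A real deployment would obtain the cumulative counts through network messages rather than a local list.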
Pages: 1220-1240
Page count: 20