Analyzing Data Distribution for Dynamic Data Sets

Authors
Shi, Yong [1]
Kim, Sunpil [1]
Affiliations
[1] Kennesaw State Univ, Dept Comp Sci, Kennesaw, GA 30144 USA
DOI
Not available
Chinese Library Classification
F [Economics]
Discipline classification code
02
Abstract
In this paper, we discuss the data distribution of data sets that change constantly. In our previous work [1], we analyzed how the distribution changes in multi-dimensional data space and proposed an approach to processing multi-dimensional data sets. Similarity search defines a distance between data points and a given query point Q, and efficiently and effectively selects the data points closest to Q. Clusters are subgroups of a data set whose points are similar to one another. In [1], we proposed an approach to reconstructing clusters based on K nearest neighbor search results for dynamic data sets. However, in high-dimensional spaces, not all dimensions may be relevant to a given cluster, and natural clusters might not exist in the full data space. In this paper, we extend our work to subspaces and design an algorithm to detect subclusters that are readjusted continuously as the data set changes and new query requests arrive. The reconstructed subclusters can help improve the performance of future K nearest neighbor searches.
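The similarity search idea in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distance and a brute-force scan; the function name `knn` and the `dims` parameter are hypothetical and are not taken from the paper. It only shows how restricting the distance to a subset of dimensions (a subspace) can change which points count as nearest to Q, and it is not the authors' subcluster algorithm.

```python
import math

def knn(points, query, k, dims=None):
    """Return indices of the k points closest to `query` (Euclidean distance).

    `dims` is an illustrative parameter: when given, only those dimensions
    are used for the distance, i.e. the search runs in a subspace.
    """
    dims = list(range(len(query))) if dims is None else list(dims)

    def dist(p):
        return math.sqrt(sum((p[d] - query[d]) ** 2 for d in dims))

    # Brute-force scan: rank every point by its distance to the query.
    order = sorted(range(len(points)), key=lambda i: dist(points[i]))
    return order[:k]

# Toy data: the third dimension is noisy and hides the closeness of
# points 0 and 3 to the query in the first two dimensions.
points = [(0.0, 0.0, 9.0), (1.0, 1.0, 0.0), (5.0, 5.0, 0.1), (0.2, 0.1, 7.0)]
query = (0.0, 0.0, 0.0)

full = knn(points, query, 2)               # all three dimensions
sub = knn(points, query, 2, dims=[0, 1])   # subspace of dimensions 0 and 1
print(full, sub)
```

In this toy data the full-space search returns points 1 and 3, while the search in the subspace of the first two dimensions returns points 0 and 3, which is the kind of difference that motivates looking for clusters in subspaces.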
Pages: 1046-1052
Number of pages: 7