Cluster analysis of massive datasets in astronomy

Cited by: 0
Authors
Woncheol Jang
Martin Hendry
Affiliations
[1] University of Georgia,Department of Epidemiology and Biostatistics
[2] University of Glasgow,Department of Physics and Astronomy
Source
Statistics and Computing | 2007 / Vol. 17
Keywords
Density contour cluster; Level set; Clustering; Fast Fourier transform
DOI
Not available
Abstract
Clusters of galaxies are a useful proxy for tracing the distribution of mass in the universe. By measuring the mass of clusters of galaxies on different scales, one can follow the evolution of the mass distribution (Martínez and Saar, Statistics of the Galaxy Distribution, 2002). It can be shown that finding galaxy clusters is equivalent to finding density contour clusters (Hartigan, Clustering Algorithms, 1975): connected components of the level set S_c ≡ {f > c}, where f is a probability density function. Cuevas et al. (Can. J. Stat. 28, 367–382, 2000; Comput. Stat. Data Anal. 36, 441–459, 2001) proposed a nonparametric method that finds density contour clusters using the minimal spanning tree. While their algorithm is conceptually simple, it requires intensive computations for large datasets. We propose a more efficient clustering method based on their algorithm with the Fast Fourier Transform (FFT). The method is applied to a study of galaxy clustering on large astronomical sky survey data.
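The idea of a density contour cluster can be illustrated with a minimal sketch. It is not the authors' algorithm: in place of the spanning-tree step of Cuevas et al., it assumes a simpler grid-based variant in which a binned Gaussian kernel density estimate is smoothed with the FFT, and the connected components of the grid cells where the estimate exceeds c are then labelled. The function names, grid size, bandwidth, and threshold are all hypothetical choices made for illustration, and NumPy is assumed to be available.

```python
from collections import deque

import numpy as np


def fft_kde(points, grid_size=64, bandwidth=0.03, bounds=(0.0, 1.0)):
    """Binned Gaussian KDE on a square 2-D grid, smoothed via the FFT.

    Assumption: the FFT performs a circular convolution, so mass near one
    edge wraps to the opposite edge; keep the data away from the borders.
    """
    lo, hi = bounds
    cell = (hi - lo) / grid_size
    counts, _, _ = np.histogram2d(
        points[:, 0], points[:, 1],
        bins=grid_size, range=[[lo, hi], [lo, hi]],
    )
    # Integer grid offsets in wrapped order (0, 1, ..., -2, -1), scaled to
    # data units, so the kernel is centred on index 0.
    offsets = np.fft.fftfreq(grid_size, d=1.0 / grid_size) * cell
    k1 = np.exp(-0.5 * (offsets / bandwidth) ** 2)
    kernel = np.outer(k1, k1)
    kernel /= kernel.sum() * cell ** 2  # discrete kernel integrates to 1
    smoothed = np.fft.irfft2(
        np.fft.rfft2(counts) * np.fft.rfft2(kernel), s=counts.shape
    )
    return smoothed / len(points)


def level_set_clusters(density, c):
    """Label 4-connected components of the grid level set {density > c}."""
    mask = density > c
    labels = np.zeros(mask.shape, dtype=int)
    n_clusters = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue  # cell already belongs to a labelled component
        n_clusters += 1
        labels[start] = n_clusters
        queue = deque([start])
        while queue:  # breadth-first flood fill
            i, j = queue.popleft()
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                inside = 0 <= ni < mask.shape[0] and 0 <= nj < mask.shape[1]
                if inside and mask[ni, nj] and not labels[ni, nj]:
                    labels[ni, nj] = n_clusters
                    queue.append((ni, nj))
    return labels, n_clusters


# Synthetic stand-in for a point catalogue: two well-separated clumps.
rng = np.random.default_rng(0)
demo_points = np.vstack([
    rng.normal([0.25, 0.25], 0.03, size=(500, 2)),
    rng.normal([0.75, 0.75], 0.03, size=(500, 2)),
])
demo_density = fft_kde(demo_points)
demo_labels, demo_count = level_set_clusters(demo_density, c=5.0)
print("clusters found at level c = 5:", demo_count)
```

The FFT replaces a direct sum over all point pairs with a convolution on the grid, which is the usual reason for pairing a density estimate with the FFT on large catalogues. The price is a binning approximation and the wrap-around at the grid edges noted in the comments.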
Pages: 253-262
Page count: 9
Related papers
50 in total
  • [1] Cluster analysis of massive datasets in astronomy
    Jang, Woncheol
    Hendry, Martin
    [J]. STATISTICS AND COMPUTING, 2007, 17 (03) : 253 - 262
  • [2] Analysis of massive data in astronomy
    Shin, Min-Su
    [J]. KOREAN JOURNAL OF APPLIED STATISTICS, 2016, 29 (06) : 1107 - 1116
  • [3] 3-D visualizations of massive astronomy datasets with a Digital Dome
    Liu, CT
    Abbott, B
    Emmart, C
    MacLow, MM
    Shara, M
    Summers, FJ
    Tyson, ND
    [J]. VIRTUAL OBSERVATORIES OF THE FUTURE, PROCEEDINGS, 2001, 225 : 188 - 191
  • [4] Regression analysis for massive datasets
    Fan, Tsai-Hung
    Lin, Dennis K. J.
    Cheng, Kuang-Fu
    [J]. DATA & KNOWLEDGE ENGINEERING, 2007, 61 (03) : 554 - 562
  • [5] Massive datasets
    Kettenring, Jon R.
    [J]. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2009, 1 (01) : 25 - 32
  • [6] Cataloging and mining massive datasets for science data analysis
    Fayyad, UM
    Smyth, P
    [J]. JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 1999, 8 (03) : 589 - 610
  • [7] Tests and variables selection on regression analysis for massive datasets
    Fan, Tsai-Hung
    Cheng, Kuang-Fu
    [J]. DATA & KNOWLEDGE ENGINEERING, 2007, 63 (03) : 811 - 819
  • [8] Exploratory Trajectory Analysis for Massive Historical AIS Datasets
    Graser, Anita
    Dragaschnig, Melitta
    Widhalm, Peter
    Koller, Hannes
    Braendle, Norbert
    [J]. 2020 21ST IEEE INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (MDM 2020), 2020, : 252 - 257
  • [9] Mining of Massive Datasets
    Richter, Lothar
    [J]. BIOMETRICS, 2018, 74 (04) : 1520 - 1521
  • [10] Mining of massive datasets
    Rajaraman, Anand
    Ullman, Jeffrey David
    [J]. Mining of Massive Datasets, 2011, 9781107015357 : 1 - 315