Decomposition and Classification of Stellar Spectra Based on t-SNE

Cited by: 4
Authors
Jiang Bin [1]
Zhao Zi-liang [1]
Wang Shu-ting [1]
Wei Ji-yu [1]
Qu Mei-xia [1]
Affiliations
[1] Shandong Univ, Sch Mech Elect & Informat Engn, Weihai 264209, Peoples R China
Keywords
Manifold learning; Stellar spectral classification; Data reduction; K-nearest neighbor algorithm
DOI
10.3964/j.issn.1000-0593(2020)09-2913-05
Chinese Library Classification (CLC) number
O433 [Spectroscopy]
Subject classification code
0703; 070302
Abstract
With advances in astronomy and in telescope observing capability, large sky surveys have produced petabytes of stellar spectra. A stellar spectrum is a complex frequency-domain signal, usually composed of a continuum and absorption lines. Differences between spectra are caused mainly by the stars' effective temperature, surface gravity, and elemental abundances. Automatic classification of stellar spectra is an important part of astronomical data processing and underpins the study of stellar evolution and the measurement of stellar parameters. The sheer volume of spectra demands efficient and accurate classification methods. Traditional manual classification is slow and of low accuracy, so it cannot meet the practical need for automatic classification of massive spectral data sets, and machine learning algorithms have therefore been widely used for spectral classification. A defining feature of stellar spectra is their high dimensionality. Dimensionality reduction both extracts features and reduces the amount of computation, which makes it the first task in spectral classification. Traditional linear dimensionality reduction reduces the spectra only according to variance, so different spectral types overlap in the feature space, whereas manifold learning can produce clear class boundaries that avoid overlap and benefit subsequent classification. This paper studies the distribution of spectra in high-dimensional space and the principle by which manifold learning reduces the dimensionality of high-dimensional linear data. Two dimensionality reduction methods, t-SNE and principal component analysis (PCA), were compared, and an improved k-nearest neighbor algorithm based on the correlation distance of attribute values was used for spectral classification. The algorithm was implemented in Python with Scikit-learn.
A total of 12 000 stellar spectra with low signal-to-noise ratio from SDSS were tested, achieving high-precision automatic processing and classification of the spectral data. The experimental results show that t-SNE, a manifold learning method, can recover the low-dimensional manifold structure in high-dimensional spectral data: it finds the low-dimensional manifold features in the high-dimensional space and solves for the corresponding embedding mapping. During dimensionality reduction, the differences between spectral samples of different classes are preserved as far as possible. Three-dimensional visualization of the results shows that PCA makes the distributions of different stellar classes overlap, whereas t-SNE produces more distinct class boundaries. After feature extraction, the k-nearest neighbor algorithm based on attribute-value correlation distance achieves satisfactory classification accuracy on the test data sets. The method can also be applied to the automatic classification of massive spectral data from other telescopes and to data mining for rare objects.
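The classification step named in the abstract, a k-nearest neighbor vote under a correlation distance, can be illustrated with a minimal sketch. It assumes the standard definition of correlation distance (1 minus the Pearson correlation of two feature vectors) and a plain majority vote; the abstract does not describe the authors' improved variant, so this is not their implementation. The function names, the toy feature vectors, and the class labels "A" and "K" are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def correlation_distance(u, v):
    """Return 1 - Pearson correlation between two equal-length vectors."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    norm = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    if norm == 0.0:
        # Assumption: a constant vector has no defined correlation,
        # so it is treated as uncorrelated (distance 1).
        return 1.0
    return 1.0 - sum(a * b for a, b in zip(du, dv)) / norm


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: correlation_distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical low-dimensional features standing in for reduced spectra:
# class "A" vectors rise with index, class "K" vectors fall.
train_x = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.1, 6.2, 7.9],
    [1.5, 2.4, 3.6, 4.4],
    [4.0, 3.0, 2.0, 1.0],
    [8.1, 6.0, 3.9, 2.0],
    [4.5, 3.4, 2.6, 1.4],
]
train_y = ["A", "A", "A", "K", "K", "K"]

print(knn_predict(train_x, train_y, [0.9, 2.1, 2.9, 4.2]))
print(knn_predict(train_x, train_y, [5.0, 4.1, 2.8, 2.0]))
```

Because correlation distance compares the shape of two vectors rather than their magnitude, a query is matched to training vectors with a similar profile even when the overall scale differs, which is one plausible reason to prefer it over Euclidean distance for spectra.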
Pages: 2913-2917
Number of pages: 5