Data mining techniques on astronomical spectra data - II. Classification analysis

Cited by: 15
Authors
Yang, Haifeng [1]
Zhou, Lichan [1]
Cai, Jianghui [1, 2]
Shi, Chenhui [1]
Yang, Yuqing [1]
Zhao, Xujun [1]
Duan, Juncheng [1]
Yin, Xiaona [1]
Affiliations
[1] Taiyuan Univ Sci & Technol, Sch Comp Sci & Technol, Taiyuan 030024, Peoples R China
[2] North Univ China, Sch Comp Sci & Technol, Taiyuan 030051, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
methods: data analysis; techniques: spectroscopic; software: data analysis; MACHINE-LEARNING APPROACH; DECISION TREE; NEURAL-NETWORKS; LAMOST; STARS; IDENTIFICATION; SUBDWARFS; GALAXIES; OBJECTS; SEARCH;
DOI
10.1093/mnras/stac3292
Chinese Library Classification
P1 [Astronomy]
Discipline classification code
0704
Abstract
Classification is valuable and necessary in spectral analysis, especially for data-driven mining. With the rapid development of spectral surveys, a variety of classification techniques have been successfully applied to astronomical data processing. However, selecting an appropriate classification method in practical scenarios is difficult because the algorithms rest on different ideas and the data have different characteristics. Here, we present the second work in the data mining series: a review of spectral classification techniques. This work also consists of three parts: a systematic overview of the current literature, experimental analyses of commonly used classification algorithms, and the source code used in this paper. First, we carefully investigate the classification methods in the astronomical literature and organize them into ten types based on their algorithmic ideas. For each type of algorithm, the analysis is organized from three perspectives: (1) its current applications and usage frequency in spectral classification are summarized; (2) its basic ideas are introduced and preliminarily analysed; (3) its advantages and caveats are discussed. Secondly, the classification performance of different algorithms on unified data sets is analysed. Experimental data are selected from the LAMOST and SDSS surveys. Six groups of spectral data sets are designed according to data characteristics, data quality, and data volume to examine the performance of these algorithms. The scores of nine basic algorithms are then shown and discussed in the experimental analysis. Finally, Python source code for the nine basic algorithms is provided, together with manuals for usage and improvement.
Pages: 5904-5928
Number of pages: 25