Protostellar classification using supervised machine learning algorithms

Author
O. Miettinen
Affiliation
[1] Digia Plc/Avarea Oy
Keywords
Methods: data analysis; Stars: formation; Stars: protostars
DOI
Not available
Abstract
Classification of young stellar objects (YSOs) into different evolutionary stages helps us to understand the formation process of new stars and planetary systems. Such classification has traditionally been based on spectral energy distribution (SED) analysis. An alternative approach is provided by supervised machine learning algorithms, which can be trained to classify large samples of YSOs much faster than via SED analysis. We attempt to classify a sample of Orion YSOs (the parent sample size is 330) into different classes, where each source has already been classified using multiwavelength SED analysis. We used eight different learning algorithms to classify the target YSOs, namely a decision tree, random forest, gradient boosting machine (GBM), logistic regression, naïve Bayes classifier, k-nearest neighbour classifier, support vector machine, and neural network. The classifiers were trained and tested by using a 10-fold cross-validation procedure. As the learning features, we employed ten different continuum flux densities spanning from the near-infrared to submillimetre wavebands (λ = 3.6–870 μm). With a classification accuracy of 82% (with respect to the SED-based classes), a GBM algorithm was found to exhibit the best performance. The lowest accuracy of 47% was obtained with a naïve Bayes classifier.
Our analysis suggests that the inclusion of the 3.6 μm and 24 μm flux densities is useful to maximise the YSO classification accuracy. Although machine learning has the potential to provide a rapid and fairly reliable way to classify YSOs, an SED analysis is still needed to derive the physical properties of the sources (e.g. dust temperature and mass), and to create the labelled training data. The machine learning classification accuracies can be improved with respect to the present results by using larger data sets, more detailed missing value imputation, and advanced ensemble methods (e.g. extreme gradient boosting). Overall, the application of machine learning is expected to be very useful in the era of big astronomical data, for example to quickly assemble interesting target source samples for follow-up studies.
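The 10-fold cross-validation procedure mentioned in the abstract can be illustrated with a minimal sketch. It assumes a toy, randomly generated sample (ten log flux densities per source, three hypothetical classes labelled "A", "B" and "C") and uses a plain k-nearest neighbour vote as a stand-in for one of the eight classifiers. None of the function names, data, or parameter values come from the paper; they are illustrative assumptions only.

```python
import math
import random
from collections import Counter


def make_synthetic_sample(n_per_class=60, n_bands=10, seed=0):
    """Toy data: n_bands log10 flux densities per source plus a class label.
    Labels and offsets are hypothetical, not the Orion sample."""
    rng = random.Random(seed)
    offsets = {"A": -1.0, "B": 0.0, "C": 1.0}
    X, y = [], []
    for label, offset in offsets.items():
        for _ in range(n_per_class):
            X.append([offset + rng.gauss(0.0, 0.6) for _ in range(n_bands)])
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training sources closest to x (Euclidean)."""
    nearest = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def k_fold_indices(n, n_folds=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cross_validated_accuracy(X, y, n_folds=10, k=5, seed=0):
    """Each fold is held out once; the classifier sees only the other folds."""
    correct = 0
    for test_idx in k_fold_indices(len(X), n_folds, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        correct += sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx
        )
    return correct / len(X)


X, y = make_synthetic_sample()
accuracy = cross_validated_accuracy(X, y)
print(f"10-fold cross-validated accuracy: {accuracy:.2f}")
```

Because every source is predicted exactly once by a model that never saw it during training, the resulting fraction is the same kind of quantity as the accuracies quoted in the abstract, although the number from this toy sample says nothing about real YSO data.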
Citation
Miettinen, O. Protostellar classification using supervised machine learning algorithms [J]. Astrophysics and Space Science, 2018, 363 (09)