Classification of Astrophysics Journal Articles with Machine Learning to Identify Data for NED

Cited by: 1
Authors
Chen, Tracy X. [1]
Ebert, Rick [1]
Mazzarella, Joseph M. [1]
Frayer, Cren [1]
Terek, Scott [1]
Chan, Ben H. P. [1]
Cook, David [1]
Lo, Tak [1]
Schmitz, Marion [1]
Wu, Xiuqin [1]
Affiliations
[1] Caltech, IPAC NED, Mail Code 100-22, 1200 E California Blvd, Pasadena, CA 91125, USA
Funding
National Aeronautics and Space Administration (NASA)
Keywords
Astronomy databases; Classification
DOI
10.1088/1538-3873/ac3c36
Chinese Library Classification
P1 [Astronomy]
Subject classification code
0704
Abstract
The NASA/IPAC Extragalactic Database (NED) is a comprehensive online service that combines fundamental multi-wavelength information for known objects beyond the Milky Way and provides value-added, derived quantities and tools to search and access the data. The contents of the database and the relationships between its measurements are continuously augmented and revised to stay current with the astrophysics literature and new sky surveys. The conventional process of distilling and extracting data from the literature requires human experts to review journal articles and determine whether an article is extragalactic in nature and, if so, what types of data it contains. This is both labor intensive and unsustainable, especially given the ever-increasing number of publications each year. We present here a machine learning (ML) approach developed and integrated into the NED production pipeline to help automate the classification of journal article topics and their data content for inclusion in NED. We show that this ML application can successfully reproduce the classifications of a human expert with an accuracy of over 90% in a fraction of the time it takes a human, allowing us to focus human expertise on tasks that are more difficult to automate.
Pages: 9