Knowledge discovery through text-based similarity searches for astronomy literature

Cited by: 6
Author
Kerzendorf, Wolfgang E. [1 ,2 ]
Affiliations
[1] NYU, Ctr Cosmol & Particle Phys, 726 Broadway, New York, NY 10003 USA
[2] European Southern Observ, Karl Schwarzschild Str 2, D-85748 Garching, Germany
Keywords
Natural language processing; methods: statistical
DOI
10.1007/s12036-019-9590-5
Chinese Library Classification number
P1 [Astronomy]
Discipline classification code
0704
Abstract
The increase in the number of researchers coupled with the ease of publishing and distribution of scientific papers (due to technological advancements) has resulted in a dramatic increase in astronomy literature. This has likely led to the predicament that the body of the literature is too large for traditional human consumption and that related and crucial knowledge is not discovered by researchers. In addition to the increased production of astronomical literature, recent decades have also brought several advancements in computational linguistics. Especially, the machine-aided processing of literature dissemination might make it possible to convert this stream of papers into a coherent knowledge set. In this paper, we present the application of computational linguistics techniques to astronomy literature. In particular, we developed a tool that will find similar articles purely based on text content from an input paper. We find that our technique performs robustly in comparison with other tools recommending articles given a reference paper (known as recommender systems). Our novel tool shows great power in combining computational linguistics with astronomy literature and suggests that additional research in this endeavor will likely produce even better tools that will help researchers cope with vast amounts of knowledge being produced.
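The abstract does not say which similarity measure the tool uses. As an illustration only, the sketch below ranks a few made-up abstracts against an input text using TF-IDF weighting and cosine similarity, a common baseline for text-based similarity search. The technique is swapped in for illustration and is not claimed to be the paper's method; the function names (`tokenize`, `tfidf_vectors`, `cosine`, `most_similar`) and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_index, documents, top_n=3):
    """Rank all other documents by similarity to documents[query_index]."""
    vectors = tfidf_vectors(documents)
    query = vectors[query_index]
    scored = [
        (cosine(query, vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    # Highest similarity first; ties broken by document index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(i, score) for score, i in scored[:top_n]]


# Made-up sample texts, not taken from any real corpus.
abstracts = [
    "Type Ia supernova progenitors and white dwarf companions",
    "Spectral synthesis of type Ia supernova ejecta",
    "Galaxy clustering and dark matter halo mass",
    "Searching for surviving white dwarf companions of supernova remnants",
]

ranked = most_similar(0, abstracts, top_n=3)
for index, score in ranked:
    print(f"{score:.3f}  {abstracts[index]}")
```

A real system would likely work on full article text, remove stop words, and use a sparse matrix library rather than Python dicts; this sketch keeps to the standard library so that the ranking logic stays visible.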
Pages: 7
Related papers
50 in total
  • [21] ONLINE PROCESSING OF PREEXISTING KNOWLEDGE MISCONCEPTIONS AND TEXT-BASED INCONSISTENCIES
    PRINZO, OV
    DANKS, JH
    [J]. BULLETIN OF THE PSYCHONOMIC SOCIETY, 1987, 25 (05) : 350 - 350
  • [22] Validation: Knowledge- and Text-Based Monitoring During Reading
    van Moort, Marianne L.
    Koornneef, Arnout
    van den Broek, Paul W.
    [J]. DISCOURSE PROCESSES, 2018, 55 (5-6) : 480 - 496
  • [23] Personal Knowledge Base Construction from Text-based Lifelogs
    Yen, An-Zi
    Huang, Hen-Hsen
    Chen, Hsin-Hsi
    [J]. PROCEEDINGS OF THE 42ND INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '19), 2019, : 185 - 194
  • [24] Text-Based Chatbot in Financial Sector: A Systematic Literature Review
    Wube, Hana Demma
    Esubalew, Sintayehu Zekarias
    Weldesellasie, Firesew Fayiso
    Debelee, Taye Girma
    [J]. DATA SCIENCE IN FINANCE AND ECONOMICS, 2022, 2 (03) : 209 - 236
  • [25] Comparison of Text-Based and Feature-Based Semantic Similarity Between Android Apps
    Uddin, Md Kafil
    He, Qiang
    Han, Jun
    Chua, Caslon
    [J]. WEB INFORMATION SYSTEMS ENGINEERING, WISE 2020, PT I, 2020, 12342 : 530 - 545
  • [26] Comparison of text-based and linked-based metrics in terms of estimating the similarity of articles
    Goltaji, Marzieh
    Abbaspour, Javad
    Jowkar, Abdolrasool
    Fakhrahmad, Seyed Mostafa
    [J]. JOURNAL OF LIBRARIANSHIP AND INFORMATION SCIENCE, 2024, 56 (03) : 760 - 772
  • [27] VizCommender: Computing Text-Based Similarity in Visualization Repositories for Content-Based Recommendations
    Oppermann, Michael
    Kincaid, Robert
    Munzner, Tamara
    [J]. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2021, 27 (02) : 495 - 505
  • [28] Text-based similarity searching for hit- and lead-candidate identification
    Volker Hähnke
    [J]. Journal of Cheminformatics, 4 (Suppl 1)
  • [29] Testing the model-observer similarity hypothesis with text-based worked examples
    Hoogerheide, Vincent
    Loyens, Sofie M. M.
    Jadi, Fedora
    Vrins, Anna
    van Gog, Tamara
    [J]. EDUCATIONAL PSYCHOLOGY, 2017, 37 (02) : 112 - 127
  • [30] Pitfalls in users' evaluation of algorithms for text-based similarity detection in medical education
    Scavnicky, Jakub
    Karolyi, Matej
    Ruzickova, Petra
    Pokorna, Andrea
    Harazim, Hana
    Stourac, Petr
    Komenda, Martin
    [J]. PROCEEDINGS OF THE 2018 FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS (FEDCSIS), 2018, : 109 - 116