Testing the reliability of an AI-based large language model to extract ecological information from the scientific literature

Cited by: 8
Authors
Andrew V. Gougherty
Hannah L. Clipp
Affiliations
[1] USDA Forest Service Northern Research Station,
Source
npj Biodiversity, Volume 3, Issue 1
DOI
10.1038/s44185-024-00043-9
Abstract
Artificial intelligence-based large language models (LLMs) have the potential to substantially improve the efficiency and scale of ecological research, but their propensity for delivering incorrect information raises significant concern about their usefulness in their current state. Here, we formally test how quickly and accurately an LLM performs in comparison to a human reviewer when tasked with extracting various types of ecological data from the scientific literature. We found the LLM was able to extract relevant data over 50 times faster than the reviewer and had very high accuracy (>90%) in extracting discrete and categorical data, but it performed poorly when extracting certain quantitative data. Our case study shows that LLMs offer great potential for generating large ecological databases at unprecedented speed and scale, but additional quality assurance steps are required to ensure data integrity.