Comparing Generative AI Literature Reviews Versus Human-Led Systematic Literature Reviews: A Case Study on Big Data Research

Author
Tosi, Davide [1]
Affiliation
[1] Univ Insubria, Dept Theoret & Appl Sci, I-20110 Varese, Italy
Source
IEEE ACCESS | 2025 / Vol. 13
Keywords
Big Data; Artificial intelligence; Real-time systems; Accuracy; Manuals; Generative AI; Finance; Scalability; AI-assisted research; big data; generative AI; large language models; systematic literature review; MANAGEMENT;
DOI
10.1109/ACCESS.2025.3554504
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Generative Artificial Intelligence (GenAI) and Large Language Models (LLMs) are transforming research methodologies, including Systematic Literature Reviews (SLRs). While traditional, human-led SLRs are labor-intensive, AI-driven approaches promise efficiency and scalability. However, the reliability and accuracy of AI-generated literature reviews remain uncertain. This study investigates the performance of GPT-4-powered Consensus in conducting an SLR on Big Data research, comparing its results with a manually conducted SLR. To evaluate Consensus, we analyzed its ability to detect relevant studies, extract key insights, and synthesize findings. Our human-led SLR identified 32 primary studies (PSs) and 207 related works, whereas Consensus detected 22 PSs, with 16 overlapping with the manual selection and 5 false positives. The AI-selected studies had an average citation count of 202 per study, significantly higher than the 64.4 citations per study in the manual SLR, indicating a possible bias toward highly cited papers. However, none of the 32 PSs selected manually were included in the AI-generated results, highlighting recall and selection accuracy limitations. Key findings reveal that Consensus accelerates literature retrieval but suffers from hallucinations, reference inaccuracies, and limited critical analysis. Specifically, it failed to capture nuanced research challenges and missed important application domains. Precision, recall, and F1 scores of the AI-selected studies were 76.2%, 38.1%, and 50.6%, respectively, demonstrating that while AI retrieves relevant papers with high precision, it lacks comprehensiveness. To mitigate these limitations, we propose a hybrid AI-human SLR framework, where AI enhances search efficiency while human reviewers ensure rigor and validity. While AI can support literature reviews, human oversight remains essential for ensuring accuracy and depth. 
Future research should assess AI-assisted SLRs across multiple disciplines to validate generalizability and explore domain-specific LLMs for improved performance.
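The precision, recall, and F1 figures quoted in the abstract follow the usual retrieval definitions. The sketch below is a minimal illustration of those definitions, not the author's evaluation code. The function name `retrieval_scores` and the counts in the usage example are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalScores:
    precision: float
    recall: float
    f1: float


def retrieval_scores(true_positives: int,
                     false_positives: int,
                     false_negatives: int) -> RetrievalScores:
    """Compute precision, recall, and F1 from raw selection counts.

    true_positives:  studies selected by the tool that the reference review also selected
    false_positives: studies selected by the tool that the reference review rejected
    false_negatives: studies in the reference review that the tool missed
    """
    selected = true_positives + false_positives
    relevant = true_positives + false_negatives

    # Guard against empty denominators; a score of 0.0 is returned in that case.
    precision = true_positives / selected if selected else 0.0
    recall = true_positives / relevant if relevant else 0.0

    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return RetrievalScores(precision, recall, f1)


if __name__ == "__main__":
    # Illustrative counts only (hypothetical, not the study's data).
    scores = retrieval_scores(true_positives=8, false_positives=2, false_negatives=8)
    print(f"precision={scores.precision:.3f} "
          f"recall={scores.recall:.3f} f1={scores.f1:.3f}")
```

High precision with low recall, the pattern the abstract reports, corresponds to few false positives but many false negatives in this formulation.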
Pages: 56210-56219
Page count: 10