Poisoning medical knowledge using large language models

Cited: 0
Authors
Yang, Junwei [1 ]
Xu, Hanwen [2 ]
Mirzoyan, Srbuhi [1 ]
Chen, Tong [2 ]
Liu, Zixuan [2 ]
Liu, Zequn [1 ]
Ju, Wei [1 ]
Liu, Luchen [1 ]
Xiao, Zhiping [2 ]
Zhang, Ming [1 ]
Wang, Sheng [2 ]
Affiliations
[1] Peking Univ, Sch Comp Sci, Anker Embodied AI Lab, State Key Lab Multimedia Informat Proc, Beijing, Peoples R China
[2] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
Funding
National Natural Science Foundation of China
DOI
10.1038/s42256-024-00899-3
CLC classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Biomedical knowledge graphs (KGs) constructed from the medical literature have been widely used to validate biomedical discoveries and generate new hypotheses. Recently, large language models (LLMs) have demonstrated a strong ability to generate human-like text. Although most of this text has been useful, LLMs can also be used to generate malicious content. Here, we investigate whether a malicious actor can use an LLM to generate a malicious paper that poisons medical KGs and thereby affects downstream biomedical applications. As a proof of concept, we develop Scorpius, a conditional text-generation model that produces a malicious paper abstract conditioned on a promoted drug and a target disease. The goal is to fool a medical KG constructed from a mixture of this malicious abstract and millions of real papers, so that KG consumers misidentify the promoted drug as relevant to the target disease. We evaluated Scorpius on a KG constructed from 3,818,528 papers and found that Scorpius can raise the relevance of 71.3% of drug-disease pairs from the top 1,000 to the top ten by adding only a single malicious abstract. Moreover, abstracts generated by Scorpius achieve better perplexity than those generated by ChatGPT, suggesting that such malicious abstracts cannot be efficiently detected by humans. Collectively, Scorpius demonstrates the possibility of poisoning medical KGs and manipulating downstream applications using LLMs, underscoring the importance of accountable and trustworthy medical knowledge discovery in the era of LLMs.
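The attack described in the abstract can be made concrete with a toy example. The sketch below is not the authors' Scorpius pipeline (which targets a KG built from 3,818,528 papers and a learned relevance model); it uses hypothetical entity names and a deliberately simple shared-neighbour relevance score to show how the triples extracted from a single fabricated abstract can flip a drug-disease ranking.

```python
# Toy demonstration: one fabricated paper flips a KG-based drug ranking.
# All entity names and the shared-neighbour scoring rule are hypothetical,
# chosen only to make the poisoning mechanism visible in a few lines.
from collections import defaultdict

def build_kg(papers):
    """Aggregate (head, relation, tail) triples extracted from paper
    abstracts into an undirected neighbour map: entity -> linked entities."""
    kg = defaultdict(set)
    for triples in papers:
        for head, _relation, tail in triples:
            kg[head].add(tail)
            kg[tail].add(head)
    return kg

def relevance(kg, drug, disease):
    """Toy relevance score: how many genes the drug and disease share."""
    return len(kg[drug] & kg[disease])

# Honest literature: each "paper" contributes a few extracted triples.
corpus = [
    [("drug_A", "targets", "gene_1"), ("drug_A", "targets", "gene_2")],
    [("drug_B", "targets", "gene_3")],
    [("disease_X", "associated_with", "gene_1"),
     ("disease_X", "associated_with", "gene_2"),
     ("disease_X", "associated_with", "gene_3")],
]

kg = build_kg(corpus)
rank = sorted(["drug_A", "drug_B"],
              key=lambda d: relevance(kg, d, "disease_X"), reverse=True)
print(rank)  # ['drug_A', 'drug_B']: drug_A shares 2 genes, drug_B shares 1

# One malicious abstract, written so the extraction pipeline emits edges
# tying the promoted drug_B to the disease's remaining genes.
poisoned = [("drug_B", "targets", "gene_1"), ("drug_B", "targets", "gene_2")]
kg = build_kg(corpus + [poisoned])
rank = sorted(["drug_A", "drug_B"],
              key=lambda d: relevance(kg, d, "disease_X"), reverse=True)
print(rank)  # ['drug_B', 'drug_A']: one poisoned paper flips the ranking
```

The detectability claim rests on perplexity: text that a language model finds fluent (low perplexity) is assumed to be hard for a human screener to flag. Below is a minimal sketch of the metric, assuming the standard Hugging Face transformers API and the small public gpt2 checkpoint as a stand-in reader model; it does not reproduce the paper's actual evaluation setup.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    """exp(mean token-level cross-entropy) of the text under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean next-token cross-entropy
    return torch.exp(loss).item()

# Lower perplexity = more fluent to the reader model, hence harder to flag.
print(perplexity("The drug significantly reduced tumour growth in mice."))
```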
Pages: 1156-1168
Page count: 13
Related papers
50 records in total
  • [1] A medical question answering system using large language models and knowledge graphs
    Guo, Quan
    Cao, Shuai
    Yi, Zhang
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2022, 37 (11) : 8548 - 8564
  • [2] Medical large language models are vulnerable to data-poisoning attacks
    Alber, Daniel Alexander
    Yang, Zihao
    Alyakin, Anton
    Yang, Eunice
    Rai, Sumedha
    Valliani, Aly A.
    Zhang, Jeff
    Rosenbaum, Gabriel R.
    Amend-Thomas, Ashley K.
    Kurland, David B.
    Kremer, Caroline M.
    Eremiev, Alexander
    Negash, Bruck
    Wiggan, Daniel D.
    Nakatsuka, Michelle A.
    Sangwon, Karl L.
    Neifert, Sean N.
    Khan, Hammad A.
    Save, Akshay Vinod
    Palla, Adhith
    Grin, Eric A.
    Hedman, Monika
    Nasir-Moin, Mustafa
    Liu, Xujin Chris
    Jiang, Lavender Yao
    Mankowski, Michal A.
    Segev, Dorry L.
    Aphinyanaphongs, Yindalon
    Riina, Howard A.
    Golfinos, John G.
    Orringer, Daniel A.
    Kondziolka, Douglas
    Oermann, Eric Karl
    NATURE MEDICINE, 2025, 31 (2) : 618 - 626
  • [3] Evaluating the effectiveness of advanced large language models in medical knowledge: a comparative study using Japanese national medical examination
    Liu, Mingxin
    Okuhara, Tsuyoshi
    Dai, Zhehao
    Huang, Wenbo
    Gu, Lin
    Okada, Hiroko
    Furukawa, Emi
    Kiuchi, Takahiro
    INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2025, 193
  • [4] Workshop on Enterprise Knowledge Graphs using Large Language Models
    Gupta, Rajeev
    Srinivasa, Srinath
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 5271 - 5272
  • [5] Benchmarking medical large language models
    Bakhshandeh, Sadra
    NATURE REVIEWS BIOENGINEERING, 2023, 1 (8) : 543 - 543
  • [6] Knowledge Synthesis using Large Language Models for a Computational Biology Workflow Ecosystem
    Jamil, Hasan M.
    Krawetz, Stephen
    Gow, Alexander
    39TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2024, 2024, : 523 - 530
  • [7] Evaluating the quality of medical content on YouTube using large language models
    Khalil, Mahmoud
    Mohamed, Fatma
    Shoufan, Abdulhadi
    SCIENTIFIC REPORTS, 2025, 15 (1)
  • [8] Quantifying Domain Knowledge in Large Language Models
    Sayenju, Sudhashree
    Aygun, Ramazan
    Franks, Bill
    Johnston, Sereres
    Lee, George
    Choi, Hansook
    Modgil, Girish
    2023 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI, 2023, : 193 - 194
  • [9] Large language models encode clinical knowledge
    Singhal, Karan
    Azizi, Shekoofeh
    Tu, Tao
    Mahdavi, S. Sara
    Wei, Jason
    Chung, Hyung Won
    Scales, Nathan
    Tanwani, Ajay
    Cole-Lewis, Heather
    Pfohl, Stephen
    Payne, Perry
    Seneviratne, Martin
    Gamble, Paul
    Kelly, Chris
    Babiker, Abubakr
    Schaerli, Nathanael
    Chowdhery, Aakanksha
    Mansfield, Philip
    Demner-Fushman, Dina
    Agüera y Arcas, Blaise
    Webster, Dale
    Corrado, Greg S.
    Matias, Yossi
    Chou, Katherine
    Gottweis, Juraj
    Tomasev, Nenad
    Liu, Yun
    Rajkomar, Alvin
    Barral, Joelle
    Semturs, Christopher
    Karthikesalingam, Alan
    Natarajan, Vivek
    NATURE, 2023, 620 (7972) : 172 - 180