Improving the performance of machine learning penicillin adverse drug reaction classification with synthetic data and transfer learning

Authors
Stanekova, Viera [1, 2]
Inglis, Joshua M. [1, 2]
Lam, Lydia [1, 2]
Lam, Antoinette [1, 2]
Smith, William [1, 2]
Shakib, Sepehr [1, 2]
Bacchi, Stephen [1, 2]
Affiliations
[1] Royal Adelaide Hosp, Adelaide, SA 5000, Australia
[2] Univ Adelaide, Adelaide, SA, Australia
Keywords
natural language processing; artificial intelligence; delabelling; allergy
DOI
10.1111/imj.16360
Chinese Library Classification
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Background: Machine learning may assist with the identification of potentially inappropriate penicillin allergy labels. Strategies to improve the performance of existing models for this task include the use of additional training data, synthetic data and transfer learning.
Aims: To investigate the use of additional training data and novel machine learning strategies, namely synthetic data and transfer learning, to improve the performance of machine learning classification of penicillin adverse drug reactions (ADRs).
Methods: Machine learning natural language processing was applied to free-text penicillin ADR data extracted from a public health system electronic health record (EHR). ADR entries were split into training and testing data sets, and a variety of machine learning models were developed by training on various labelled data sets and then evaluated on the test data. The effect of training on additional data and synthetic data was compared with the use of transfer learning.
Results: Following the application of these techniques, the area under the receiver operating characteristic curve of the best-performing models improved to 0.984 for the classification of penicillin allergy (vs intolerance), using the artificial neural network model, and to 0.995 for high-risk allergy (vs low-risk allergy), using the transfer learning approach.
Conclusions: Machine learning models demonstrate high levels of accuracy in the classification and risk stratification of penicillin ADR labels using the reaction documented in the EHR. The models can be further optimised by incorporating additional training data and using transfer learning. Practical applications include automating case detection for penicillin allergy delabelling programmes.
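The results above are reported as the area under the receiver operating characteristic curve. As a rough illustration of what that metric measures, the sketch below computes it from binary labels and classifier scores using the rank (Mann-Whitney) formulation. It is not the authors' code: the function name `roc_auc` and the six labels and scores are hypothetical values chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation.

    This equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example,
    with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical data (assumption): 1 = allergy, 0 = intolerance,
# scores are made-up model outputs for six ADR entries.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

A value of 1.0 would mean every positive entry is scored above every negative entry, and 0.5 corresponds to chance-level ranking. The pairwise loop is quadratic in the number of entries, which is acceptable for a sketch but not for large test sets.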
Pages: 1183-1189
Number of pages: 7