Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects

Cited by: 2
Authors
Liu, Mengmeng [1]
Srivastava, Gopal [2]
Ramanujam, J. [1, 3]
Brylinski, Michal [2, 3]
Affiliations
[1] Louisiana State Univ, Div Elect & Comp Engn, Baton Rouge, LA 70803 USA
[2] Louisiana State Univ, Dept Biol Sci, Baton Rouge, LA 70803 USA
[3] Louisiana State Univ, Ctr Computat & Technol, Baton Rouge, LA 70803 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
CANCER; INHIBITOR; EFFICACY; SCREEN;
DOI
10.1038/s41598-024-51940-9
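Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];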
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Combination therapy has gained popularity in cancer treatment because it enhances treatment efficacy and helps overcome drug resistance. Although machine learning (ML) techniques have become an indispensable tool for discovering new drug combinations, the data on drug combination therapy currently available may be insufficient to build high-precision models. We developed a data augmentation protocol to scale up the existing anti-cancer drug synergy dataset in an unbiased manner. Using a new drug similarity metric, we augmented the synergy data by substituting a compound in a drug combination instance with another molecule that exhibits highly similar pharmacological effects. Using this protocol, we were able to upscale the AZ-DREAM Challenges dataset from 8,798 to 6,016,697 drug combinations. Comprehensive performance evaluations show that ML models trained on the augmented data consistently achieve higher accuracy than those trained solely on the original dataset. Our data augmentation protocol provides a systematic and unbiased approach to generating more diverse and larger-scale drug combination datasets, enabling the development of more precise and effective ML models. The protocol presented in this study could serve as a foundation for future research aimed at discovering novel and effective drug combinations for cancer treatment.
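The substitution step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record format, the `augment` function, and the `similar` lookup table are hypothetical, and the abstract does not specify the similarity metric or how labels are assigned to new instances. The sketch simply assumes that a substituted combination inherits the synergy score of the instance it was derived from.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: (drug_a, drug_b, cell_line, synergy_score)
Combo = Tuple[str, str, str, float]


def augment(combos: List[Combo], similar: Dict[str, List[str]]) -> List[Combo]:
    """Return the original records plus copies in which one drug has been
    replaced by a compound listed as similar to it.

    Assumption of this sketch: the new record keeps the original score.
    """
    seen = set()
    out: List[Combo] = []

    def add(a: str, b: str, cell: str, score: float) -> None:
        # Treat (a, b) and (b, a) as the same pair; skip self-combinations
        # and pairs already present for this cell line.
        key = (frozenset((a, b)), cell)
        if a != b and key not in seen:
            seen.add(key)
            out.append((a, b, cell, score))

    # Keep every original record first so substitutions never override them.
    for a, b, cell, score in combos:
        add(a, b, cell, score)

    # Substitute one compound at a time with each of its similar compounds.
    for a, b, cell, score in combos:
        for a_sub in similar.get(a, []):
            add(a_sub, b, cell, score)
        for b_sub in similar.get(b, []):
            add(a, b_sub, cell, score)
    return out


# Toy data with made-up names; "similar" stands in for a thresholded
# drug similarity metric.
original = [("drugA", "drugB", "cell1", 12.5)]
similar = {"drugA": ["drugA2"], "drugB": ["drugB2", "drugA"]}
augmented = augment(original, similar)
```

In this toy run, the substitution of `drugB` by `drugA` is discarded because it would pair a drug with itself, which shows why some filtering is needed when one seed record fans out into many derived records.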
Pages: 14
Related papers
50 in total
  • [1] Augmented drug combination dataset to improve the performance of machine learning models predicting synergistic anticancer effects
    Mengmeng Liu
    Gopal Srivastava
    J. Ramanujam
    Michal Brylinski
    Scientific Reports, 14
  • [2] MMFSyn: A Multimodal Deep Learning Model for Predicting Anticancer Synergistic Drug Combination Effect
    Yang, Tao
    Li, Haohao
    Kang, Yanlei
    Li, Zhong
    BIOMOLECULES, 2024, 14 (08)
  • [3] Computational Models Using Multiple Machine Learning Algorithms for Predicting Drug Hepatotoxicity with the DILIrank Dataset
    Ancuceanu, Robert
    Hovanet, Marilena Viorica
    Anghel, Adriana Iuliana
    Furtunescu, Florentina
    Neagu, Monica
    Constantin, Carolina
    Dinu, Mihaela
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2020, 21 (06)
  • [4] Predicting anticancer synergistic drug combinations based on multi-task learning
    Chen, Danyi
    Wang, Xiaowen
    Zhu, Hongming
    Jiang, Yizhi
    Li, Yulong
    Liu, Qi
    Liu, Qin
    BMC BIOINFORMATICS, 2023, 24 (01)
  • [5] Predicting anticancer synergistic drug combinations based on multi-task learning
    Danyi Chen
    Xiaowen Wang
    Hongming Zhu
    Yizhi Jiang
    Yulong Li
    Qi Liu
    Qin Liu
    BMC Bioinformatics, 24
  • [6] Ensemble Voting Schemes that Improve Machine Learning Models for Predicting the Effects of Protein Mutations
    Gunderson, Sarah
    Jagodzinski, Filip
    ACM-BCB'18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS, 2018, : 211 - 219
  • [7] A machine learning framework for predicting synergistic and antagonistic drug combinatorial efficacy
    Suyu Mei
    Journal of Mathematical Chemistry, 2022, 60 : 752 - 769
  • [8] A machine learning framework for predicting synergistic and antagonistic drug combinatorial efficacy
    Mei, Suyu
    JOURNAL OF MATHEMATICAL CHEMISTRY, 2022, 60 (04) : 752 - 769
  • [9] Artificial intelligence and machine learning methods in predicting anti-cancer drug combination effects
    Fan, Kunjie
    Cheng, Lijun
    Li, Lang
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (06)
  • [10] Effects of dataset attacks on machine learning models in e-health
    Tarek Moulahi
    Salim El Khediri
    Durre Nayab
    Mushira Freihat
    Rehan Ullah Khan
    Annals of Telecommunications, 2023, 78 : 655 - 665