Synergy conformal prediction applied to large-scale bioactivity datasets and in federated learning

Cited by: 0
Authors
Norinder, Ulf [1, 2, 3, 4]
Spjuth, Ola [1, 2]
Svensson, Fredrik [5]
Affiliations
[1] Uppsala Univ, Dept Pharmaceut Biosci, Box 591, SE-75124 Uppsala, Sweden
[2] Uppsala Univ, Sci Life Lab, Box 591, SE-75124 Uppsala, Sweden
[3] Stockholm Univ, Dept Comp & Syst Sci, Box 7003, S-16407 Kista, Sweden
[4] Orebro Univ, MTM Res Ctr, Sch Sci & Technol, S-70182 Orebro, Sweden
[5] UCL, Alzheimers Res UK UCL Drug Discovery Inst, Cruciform Bldg, Gower St, London WC1E 6BT, England
Keywords
Conformal prediction; Federated learning; Confidence; Machine learning
DOI
10.1186/s13321-021-00555-7
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Confidence predictors can deliver predictions with the associated confidence required for decision making and can play an important role in drug discovery and toxicity predictions. In this work we investigate a recently introduced version of conformal prediction, synergy conformal prediction, focusing on the predictive performance when applied to bioactivity data. We compare the performance to other variants of conformal predictors for multiple partitioned datasets and demonstrate the utility of synergy conformal predictors for federated learning where data cannot be pooled in one location. Our results show that synergy conformal predictors based on training data randomly sampled with replacement can compete with other conformal setups, while using completely separate training sets often results in worse performance. However, in a federated setup where no method has access to all the data, synergy conformal prediction is shown to give promising results. Based on our study, we conclude that synergy conformal predictors are a valuable addition to the conformal prediction toolbox.
Pages: 11
Related papers
50 in total
  • [1] Synergy conformal prediction applied to large-scale bioactivity datasets and in federated learning
    Ulf Norinder
    Ola Spjuth
    Fredrik Svensson
    [J]. Journal of Cheminformatics, 13
  • [2] Coreset-based Conformal Prediction for Large-scale Learning
    Riquelme-Granada, Nery
    Khuong An Nguyen
    Luo, Zhiyuan
    [J]. CONFORMAL AND PROBABILISTIC PREDICTION AND APPLICATIONS, VOL 105, 2019, 105
  • [3] Conformal Prediction in Spark: Large-Scale Machine Learning with Confidence
    Capuccini, Marco
    Carlsson, Lars
    Norinder, Ulf
    Spjuth, Ola
    [J]. 2015 IEEE/ACM 2ND INTERNATIONAL SYMPOSIUM ON BIG DATA COMPUTING (BDC), 2015, : 61 - 67
  • [4] Learning to Index in Large-Scale Datasets
    Prayoonwong, Amorntip
    Wang, Cheng-Hsien
    Chiu, Chih-Yi
    [J]. MULTIMEDIA MODELING, MMM 2018, PT I, 2018, 10704 : 305 - 316
  • [5] Large-Scale Secure XGB for Vertical Federated Learning
    Fang, Wenjing
    Zhao, Derun
    Tan, Jin
    Chen, Chaochao
    Yu, Chaofan
    Wang, Li
    Wang, Lei
    Zhou, Jun
    Zhang, Benyu
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 443 - 452
  • [6] Validating the validation: reanalyzing a large-scale comparison of deep learning and machine learning models for bioactivity prediction
    Matthew C. Robinson
    Robert C. Glen
    Alpha A. Lee
    [J]. Journal of Computer-Aided Molecular Design, 2020, 34 : 717 - 730
  • [7] Validating the validation: reanalyzing a large-scale comparison of deep learning and machine learning models for bioactivity prediction
    Robinson, Matthew C.
    Glen, Robert C.
    Lee, Alpha A.
    [J]. JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN, 2020, 34 (07) : 717 - 730
  • [8] RETRACTED: Large-Scale Textual Datasets and Deep Learning for the Prediction of Depressed Symptoms (Retracted Article)
    Chakraborty, Sudeshna
    Mahdi, Hussain Falih
    Al-Abyadh, Mohammed Hasan Ali
    Pant, Kumud
    Sharma, Aditi
    Ahmadi, Fardin
    [J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [9] Spatial Convergence of Federated Learning in Large-Scale Cellular Networks
    Lin, Zhenyi
    Li, Xiaoyang
    Lau, Vincent K. N.
    Gong, Yi
    Huang, Kaibin
    [J]. SPAWC 2021: 2021 IEEE 22ND INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING ADVANCES IN WIRELESS COMMUNICATIONS (IEEE SPAWC 2021), 2020, : 231 - 235
  • [10] Learning Bayesian Network Structure from Large-scale Datasets
    Hong, Yu
    Xia, Xiaoling
    Le, Jiajin
    Zhou, Xiangdong
    [J]. 2016 FOURTH INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD 2016), 2016, : 258 - 264