Synergy conformal prediction applied to large-scale bioactivity datasets and in federated learning

Cited by: 0

Authors:

Norinder, Ulf [1,2,3,4]

Spjuth, Ola [1,2]

Svensson, Fredrik [5]

Affiliations:

[1] Uppsala Univ, Dept Pharmaceut Biosci, Box 591, SE-75124 Uppsala, Sweden

[2] Uppsala Univ, Sci Life Lab, Box 591, SE-75124 Uppsala, Sweden

[3] Stockholm Univ, Dept Comp & Syst Sci, Box 7003, S-16407 Kista, Sweden

[4] Orebro Univ, MTM Res Ctr, Sch Sci & Technol, S-70182 Orebro, Sweden

[5] UCL, Alzheimers Res UK UCL Drug Discovery Inst, Cruciform Bldg,Gower St, London WC1E 6BT, England

Source:

JOURNAL OF CHEMINFORMATICS | 2021 / Volume 13 / Issue 01

Keywords:

Conformal prediction; Federated learning; Confidence; Machine learning

DOI:

10.1186/s13321-021-00555-7

Chinese Library Classification (CLC) number:

O6 [Chemistry]

Subject classification code:

0703

Abstract:

Confidence predictors can deliver predictions with the associated confidence required for decision making and can play an important role in drug discovery and toxicity predictions. In this work we investigate a recently introduced version of conformal prediction, synergy conformal prediction, focusing on the predictive performance when applied to bioactivity data. We compare the performance to other variants of conformal predictors for multiple partitioned datasets and demonstrate the utility of synergy conformal predictors for federated learning where data cannot be pooled in one location. Our results show that synergy conformal predictors based on training data randomly sampled with replacement can compete with other conformal setups, while using completely separate training sets often results in worse performance. However, in a federated setup where no method has access to all the data, synergy conformal prediction is shown to give promising results. Based on our study, we conclude that synergy conformal predictors are a valuable addition to the conformal prediction toolbox.


Pages: 11
