Coreset-based Conformal Prediction for Large-scale Learning

Cited by: 0
Authors
Riquelme-Granada, Nery [1 ]
Khuong An Nguyen [1 ]
Luo, Zhiyuan [1 ]
Affiliations
[1] Royal Holloway Univ London, Dept Comp Sci, Egham TW20 0EX, Surrey, England
Keywords
Coreset; logistic regression; importance sampling; conformal predictors; ALGORITHMS; SETS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As the volume of data increases rapidly, most traditional machine learning algorithms become computationally prohibitive. Furthermore, the available data can be so large that it easily exceeds the memory of a single machine. We propose Coreset-Based Conformal Prediction, a strategy for dealing with big data by applying conformal predictors to a weighted summary of the data, namely the coreset. We compare our approach against standalone inductive conformal predictors on three large competition-grade datasets to demonstrate that our coreset-based strategy may not only significantly improve learning speed, but also retain the validity of the predictions and the efficiency of the predictors.
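The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the importance score in `importance_sample_coreset` is a hypothetical stand-in for a proper coreset sensitivity bound, the gradient-descent logistic regression and the synthetic data are assumptions, and every function name is made up for illustration. The sketch trains a weighted model on a small sampled summary, then calibrates an inductive conformal predictor on held-out data.

```python
import numpy as np

def importance_sample_coreset(X, y, m, rng):
    """Draw a weighted subsample of size m (a stand-in for a real coreset construction)."""
    # Hypothetical importance score: half uniform mass, half distance from the data mean.
    dist = np.linalg.norm(X - X.mean(axis=0), axis=1)
    score = 0.5 / len(X) + 0.5 * dist / dist.sum()
    prob = score / score.sum()
    idx = rng.choice(len(X), size=m, replace=True, p=prob)
    weights = 1.0 / (m * prob[idx])  # inverse-probability weights
    return X[idx], y[idx], weights

def predict_proba(theta, X):
    """Class probabilities [P(y=0), P(y=1)] of a logistic model with a bias term."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    p1 = 1.0 / (1.0 + np.exp(-Xb @ theta))
    return np.column_stack([1.0 - p1, p1])

def fit_weighted_logreg(X, y, w, lr=0.5, steps=500):
    """Weighted logistic regression fitted by plain gradient descent; y in {0, 1}."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    theta = np.zeros(Xb.shape[1])
    w = w / w.sum()  # normalise so the step size does not depend on the weight scale
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ theta))
        theta -= lr * Xb.T @ (w * (p - y))
    return theta

def icp_p_values(theta, X_cal, y_cal, X_test):
    """Inductive conformal p-values with nonconformity score 1 - P(label)."""
    cal_scores = 1.0 - predict_proba(theta, X_cal)[np.arange(len(y_cal)), y_cal]
    test_scores = 1.0 - predict_proba(theta, X_test)  # one score per candidate label
    # Count calibration scores at least as large as each test score.
    ge = (cal_scores[None, None, :] >= test_scores[:, :, None]).sum(axis=2)
    return (ge + 1) / (len(cal_scores) + 1)

# Synthetic binary classification data (an assumption, not one of the paper's datasets).
rng = np.random.default_rng(0)
n = 6000
X = rng.normal(size=(n, 5))
true_w = np.array([1.5, -2.0, 0.5, 0.0, 1.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_w))).astype(int)
X_train, y_train = X[:4000], y[:4000]
X_cal, y_cal = X[4000:5000], y[4000:5000]
X_test, y_test = X[5000:], y[5000:]

# Train on the 400-point weighted summary instead of the 4000 training points.
Xc, yc, wc = importance_sample_coreset(X_train, y_train, m=400, rng=rng)
theta = fit_weighted_logreg(Xc, yc, wc)
p_vals = icp_p_values(theta, X_cal, y_cal, X_test)

epsilon = 0.1                  # significance level
pred_sets = p_vals > epsilon   # a label is kept when its p-value exceeds epsilon
error_rate = 1.0 - pred_sets[np.arange(len(y_test)), y_test].mean()
avg_set_size = pred_sets.sum(axis=1).mean()
print(f"error rate: {error_rate:.3f}, average set size: {avg_set_size:.2f}")
```

In this sketch the error rate measures validity (it should stay close to or below `epsilon`), while the average prediction-set size is one simple proxy for efficiency. The calibration set is drawn from the full data rather than from the summary, which is a design choice of this sketch and not something the abstract specifies.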
Pages: 21