Fast methods for training Gaussian processes on large datasets

Cited by: 32
Authors
Moore, C. J. [1]
Chua, A. J. K. [1]
Berry, C. P. L. [2]
Gair, J. R. [3, 4]
Affiliations
[1] Univ Cambridge, Inst Astron, Madingley Rd, Cambridge CB3 0HA, England
[2] Univ Birmingham, Sch Phys & Astron, Birmingham B15 2TT, W Midlands, England
[3] Univ Edinburgh, Sch Math, James Clerk Maxwell Bldg,Peter Guthrie Tait Rd, Edinburgh EH9 3FD, Midlothian, Scotland
[4] Biomathemat & Stat Scotland, James Clerk Maxwell Bldg,Peter Guthrie Tait Rd, Edinburgh EH9 3FD, Midlothian, Scotland
Source
ROYAL SOCIETY OPEN SCIENCE | 2016 / Vol. 3 / Issue 05
Keywords
Gaussian processes; regression; data analysis; inference; PROCESS REGRESSION; EFFICIENT; HYPERPARAMETERS; SELECTION;
DOI
10.1098/rsos.160125
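Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract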
Gaussian process regression (GPR) is a non-parametric Bayesian technique for interpolating or fitting data. The main barrier to further uptake of this powerful tool rests in the computational costs associated with the matrices which arise when dealing with large datasets. Here, we derive some simple results which we have found useful for speeding up the learning stage in the GPR algorithm, and especially for performing Bayesian model comparison between different covariance functions. We apply our techniques to both synthetic and real data and quantify the speed-up relative to using nested sampling to numerically evaluate model evidences.
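For orientation, the matrix cost mentioned in the abstract can be illustrated with the standard GPR log marginal likelihood, which is the quantity evaluated repeatedly during the learning stage. The sketch below is not the paper's method. It assumes a zero-mean Gaussian process with a squared-exponential covariance and Gaussian noise, and the function names and parameters (`squared_exponential`, `log_marginal_likelihood`, `amplitude`, `length_scale`, `noise`) are hypothetical choices made for illustration only.

```python
import numpy as np


def squared_exponential(x1, x2, amplitude=1.0, length_scale=1.0):
    """Squared-exponential covariance between two 1-D input arrays (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return amplitude**2 * np.exp(-0.5 * (diff / length_scale) ** 2)


def log_marginal_likelihood(x, y, amplitude, length_scale, noise):
    """Log-likelihood of y under a zero-mean GP with Gaussian noise.

    The Cholesky factorisation of the n x n covariance matrix costs O(n^3),
    which is the bottleneck for large datasets.
    """
    n = x.size
    K = squared_exponential(x, x, amplitude, length_scale) + noise**2 * np.eye(n)
    L = np.linalg.cholesky(K)  # K = L L^T, the O(n^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K^{-1} y
    return (
        -0.5 * y @ alpha
        - np.sum(np.log(np.diag(L)))  # equals -0.5 * log det K
        - 0.5 * n * np.log(2.0 * np.pi)
    )


if __name__ == "__main__":
    # Synthetic data, generated here purely for illustration.
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 50)
    y = np.sin(6.0 * x) + 0.1 * rng.standard_normal(x.size)
    print(log_marginal_likelihood(x, y, amplitude=1.0, length_scale=0.3, noise=0.1))
```

Each change of hyperparameters requires a fresh factorisation, so sampling-based evidence estimates that call such a function many times become expensive as the number of data points grows.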
Pages: 10
Related papers
50 in total
  • [1] Fast Direct Methods for Gaussian Processes
    Ambikasaran, Sivaram
    Foreman-Mackey, Daniel
    Greengard, Leslie
    Hogg, David W.
    O'Neil, Michael
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (02) : 252 - 265
  • [2] A Fast SVM Training Method for Very Large Datasets
    Li, Boyang
    Wang, Qiangwei
    Hu, Jinglu
    IJCNN: 2009 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6, 2009, : 277 - 282
  • [3] Fast SVM training using edge detection on very large datasets
    Li, Boyang
    Wang, Qiangwei
    Hu, Jinglu
    IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING, 2013, 8 (03) : 229 - 237
  • [4] Datasets, tasks, and training methods for large-scale hypergraph learning
    Kim, Sunwoo
    Lee, Dongjin
    Kim, Yul
    Park, Jungho
    Hwang, Taeho
    Shin, Kijung
    DATA MINING AND KNOWLEDGE DISCOVERY, 2023, 37 (06) : 2216 - 2254
  • [5] Datasets, tasks, and training methods for large-scale hypergraph learning
    Sunwoo Kim
    Dongjin Lee
    Yul Kim
    Jungho Park
    Taeho Hwang
    Kijung Shin
    Data Mining and Knowledge Discovery, 2023, 37 : 2216 - 2254
  • [6] Fast SVM training using data reconstruction for classification of very large datasets
    Liang, Peileng
    Li, Weite
    Hu, Jinglu
    IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING, 2020, 15 (03) : 372 - 381
  • [7] Bayesian dynamic modeling for large space-time datasets using Gaussian predictive processes
    Finley, Andrew O.
    Banerjee, Sudipto
    Gelfand, Alan E.
    JOURNAL OF GEOGRAPHICAL SYSTEMS, 2012, 14 (01) : 29 - 47
  • [8] Efficient Gaussian process regression for large datasets
    Banerjee, Anjishnu
    Dunson, David B.
    Tokdar, Surya T.
    BIOMETRIKA, 2013, 100 (01) : 75 - 89
  • [9] Bayesian dynamic modeling for large space-time datasets using Gaussian predictive processes
    Andrew O. Finley
    Sudipto Banerjee
    Alan E. Gelfand
    Journal of Geographical Systems, 2012, 14 : 29 - 47
  • [10] Fast Support Vector Data Description Training Using Edge Detection on Large Datasets
    Hu, Chenlong
    Zhou, Bo
    Hu, Jinglu
    PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 2176 - 2182