Approximate inference of the bandwidth in multivariate kernel density estimation

Cited by: 24
Authors
Filippone, Maurizio [1]
Sanguinetti, Guido [2]
Affiliations
[1] UCL, Dept Stat Sci, London WC1E 7HB, England
[2] Univ Edinburgh, Sch Informat, Edinburgh EH8 9AB, Midlothian, Scotland
Keywords
Kernel density estimation; Bayesian inference; Expectation propagation; Multivariate analysis; CROSS-VALIDATION; BAYESIAN-APPROACH; MONTE-CARLO; SELECTION
DOI
10.1016/j.csda.2011.05.023
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Kernel density estimation is a popular and widely used non-parametric method for data-driven density estimation. Its appeal lies in its simplicity and ease of implementation, as well as in its strong asymptotic results regarding convergence to the true data distribution. However, a major difficulty is the setting of the bandwidth, particularly in high dimensions and with a limited amount of data. An approximate Bayesian method is proposed, based on the Expectation-Propagation algorithm with a likelihood obtained from a leave-one-out cross-validation approach. The proposed method yields an iterative procedure to approximate the posterior distribution of the inverse bandwidth. The approximate posterior can be used to estimate the model evidence for selecting the structure of the bandwidth and to approach online learning. Extensive experimental validation shows that the proposed method is competitive in terms of performance with state-of-the-art plug-in methods. (C) 2011 Elsevier B.V. All rights reserved.
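The leave-one-out likelihood mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's Expectation-Propagation procedure: it assumes an isotropic Gaussian kernel with a single scalar bandwidth h, and it simply maximizes the leave-one-out log-likelihood over a grid instead of approximating a posterior over the inverse bandwidth. The function name loo_log_likelihood, the synthetic data, and the grid are all hypothetical choices made for illustration.

```python
import numpy as np

def loo_log_likelihood(X, h):
    """Leave-one-out log-likelihood of an isotropic Gaussian KDE.

    Assumption: one scalar bandwidth h shared by all dimensions.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    sq = np.sum(diff ** 2, axis=-1)
    # Log of the Gaussian kernel for every pair of points.
    log_k = -0.5 * sq / h**2 - d * np.log(h) - 0.5 * d * np.log(2 * np.pi)
    # Leave each point out of its own density estimate.
    np.fill_diagonal(log_k, -np.inf)
    # log((1/(n-1)) * sum_{j != i} K_h(x_i - x_j)), via log-sum-exp.
    m = np.max(log_k, axis=1, keepdims=True)
    log_dens = m[:, 0] + np.log(np.sum(np.exp(log_k - m), axis=1)) - np.log(n - 1)
    return float(np.sum(log_dens))

# Synthetic 2-D data and a log-spaced bandwidth grid (both illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
grid = np.logspace(-2, 1, 30)
scores = [loo_log_likelihood(X, h) for h in grid]
best_h = grid[int(np.argmax(scores))]
```

A grid search returns a single point estimate of the bandwidth. The method described in the abstract instead approximates a full posterior over the inverse bandwidth, which is what makes evidence estimates for comparing bandwidth structures available.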
Pages: 3104-3122
Number of pages: 19