Estimating the support of a high-dimensional distribution

Cited by: 3596
Authors
Schölkopf, B
Platt, JC
Shawe-Taylor, J
Smola, AJ
Williamson, RC
Affiliations
[1] Microsoft Res Ltd, Cambridge CB2 3NH, England
[2] Microsoft Res, Redmond, WA 98052 USA
[3] Univ London Royal Holloway & Bedford New Coll, Egham TW20 0EX, Surrey, England
[4] Australian Natl Univ, Dept Engn, Canberra, ACT 0200, Australia
DOI
10.1162/089976601750264965
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.
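The pairwise optimization mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard one-class dual (minimize 1/2 Σ α_i α_j k(x_i, x_j) subject to 0 ≤ α_i ≤ 1/(νn) and Σ α_i = 1) with a Gaussian kernel. The function names, the uniform starting point, the fixed number of sweeps over all pairs, and the rule used to pick the offset rho are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def rbf(x, y, gamma=0.5):
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def fit_one_class(points, nu=0.2, gamma=0.5, passes=30):
    """Pairwise coordinate descent on the assumed one-class dual."""
    n = len(points)
    upper = 1.0 / (nu * n)  # box constraint 0 <= alpha_i <= 1/(nu*n)
    K = [[rbf(p, q, gamma) for q in points] for p in points]
    alpha = [1.0 / n] * n  # feasible start (assumption, sums to 1)
    # out[i] = sum_k alpha_k * k(x_k, x_i)
    out = [sum(alpha[k] * K[k][i] for k in range(n)) for i in range(n)]
    for _ in range(passes):
        for i in range(n):
            for j in range(i + 1, n):
                eta = K[i][i] + K[j][j] - 2.0 * K[i][j]
                if eta <= 1e-12:
                    continue
                total = alpha[i] + alpha[j]  # preserved by the update
                new_j = alpha[j] + (out[i] - out[j]) / eta
                # Clip so that both coefficients stay inside the box.
                low, high = max(0.0, total - upper), min(upper, total)
                new_j = min(max(new_j, low), high)
                delta = new_j - alpha[j]
                if delta == 0.0:
                    continue
                alpha[j] = new_j
                alpha[i] = total - new_j
                for k in range(n):
                    out[k] += delta * (K[j][k] - K[i][k])
    # Offset heuristic: average output over coefficients strictly
    # inside the box; fall back to the smallest output otherwise.
    inside = [out[i] for i in range(n) if 1e-8 < alpha[i] < upper - 1e-8]
    rho = sum(inside) / len(inside) if inside else min(out)
    return alpha, rho


def decision(x, points, alpha, rho, gamma=0.5):
    """f(x) = sum_i alpha_i k(x_i, x) - rho; positive means inside S."""
    return sum(a * rbf(p, x, gamma) for a, p in zip(alpha, points)) - rho


random.seed(0)
train = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(30)]
alpha, rho = fit_one_class(train, nu=0.2)
far = decision((25.0, 25.0), train, alpha, rho)  # point far from the data
```

Each pair update keeps the sum of the two coefficients fixed, so the equality constraint holds throughout, and the clipping keeps every coefficient inside the box. A point far from the training data receives a negative decision value because every kernel term vanishes while rho stays positive.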
Pages: 1443-1471
Page count: 29