Fast Support Vector Machine classification of very large datasets

Cited by: 4
Authors
Fehr, Janis [1 ]
Arreola, Karina Zapien [2 ]
Burkhardt, Hans
Affiliations
[1] Univ Freiburg, Chair Pattern Recognit & Image Proc, D-79110 Freiburg, Germany
[2] INSA Rouen, LITIS, F-76801 Rouen, France
Keywords
DOI
10.1007/978-3-540-78246-9_2
Chinese Library Classification (CLC) number
F [Economics];
Discipline classification code
02 ;
Abstract
In many classification applications, Support Vector Machines (SVMs) have proven to be high-performing, easy-to-handle classifiers with very good generalization abilities. However, one drawback of the SVM is its rather high classification complexity, which scales linearly with the number of Support Vectors (SVs). This is because classifying one sample requires evaluating the kernel function for all SVs. To speed up classification, different approaches have been published, most of which try to reduce the number of SVs. In our work, which is especially suitable for very large datasets, we follow a different approach: as we showed in (Zapien et al. 2006), it is effectively possible to approximate large SVM problems by decomposing the original problem into linear subproblems, where each subproblem can be evaluated in Omega(1). This approach is especially successful when the assumption holds that a large classification problem can be split into mainly easy and only a few hard subproblems. On standard benchmark datasets, this approach achieved great speedups while suffering only slightly in terms of classification accuracy and generalization ability. In this contribution, we extend the methods introduced in (Zapien et al. 2006) by using not only linear but also non-linear subproblems for the decomposition of the original problem, which further increases the classification performance with only a small loss in speed. An implementation of our method is available in (Ronneberger et al.). Due to page limitations, we had to move some of the theoretical details (e.g. proofs) and extensive experimental results to a technical report (Zapien et al. 2007).
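The cost argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and all names and toy values below (`rbf_kernel`, `gamma`, the support vectors and coefficients) are assumptions chosen for illustration. It shows that a kernel decision function needs one kernel evaluation per SV, whereas a linear subproblem collapses into a single weight vector whose evaluation cost does not depend on the number of SVs.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative value, not from the paper."""
    return math.exp(-gamma * sum((a - c) ** 2 for a, c in zip(x, z)))


def linear_kernel(x, z):
    """Plain dot product."""
    return sum(a * c for a, c in zip(x, z))


def kernel_decision(x, support_vectors, coefs, bias, kernel):
    """f(x) = sum_i coef_i * k(s_i, x) + bias; returns (score, kernel evaluations)."""
    score = bias
    evaluations = 0
    for sv, coef in zip(support_vectors, coefs):
        score += coef * kernel(sv, x)
        evaluations += 1  # one kernel call per support vector
    return score, evaluations


def collapse_linear(support_vectors, coefs):
    """For a linear kernel, fold all SVs into w = sum_i coef_i * s_i (done once)."""
    dim = len(support_vectors[0])
    return [sum(c * sv[d] for sv, c in zip(support_vectors, coefs))
            for d in range(dim)]


def linear_decision(x, w, bias):
    """f(x) = <w, x> + bias; cost depends on the dimension only."""
    return sum(wi * xi for wi, xi in zip(w, x)) + bias


# Hypothetical toy model: 4 support vectors in 2 dimensions.
svs = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (-1.0, 0.5)]
coefs = [0.7, -0.4, 0.3, -0.6]  # stands in for alpha_i * y_i
bias = 0.1
x = (0.5, -0.2)

rbf_score, rbf_evals = kernel_decision(x, svs, coefs, bias, rbf_kernel)
lin_score_via_svs, _ = kernel_decision(x, svs, coefs, bias, linear_kernel)
w = collapse_linear(svs, coefs)
lin_score = linear_decision(x, w, bias)
```

In this sketch `rbf_evals` equals the number of SVs, while `lin_score` matches `lin_score_via_svs` without touching the SVs at classification time, which is why linear subproblems are cheap to evaluate.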
Pages: 11 / +
Number of pages: 2
Related papers
50 in total
  • [1] Mining very large datasets with support vector machine algorithms
    Poulet, F
    Do, TN
    [J]. ENTERPRISE INFORMATION SYSTEMS V, 2004, : 177 - 184
  • [2] Multiresolution hierarchical support vector machine for classification of large datasets
    Alwajidi, Safaa
    Yang, Li
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2022, 64 (12) : 3447 - 3462
  • [3] Multiresolution hierarchical support vector machine for classification of large datasets
    Safaa Alwajidi
    Li Yang
    [J]. Knowledge and Information Systems, 2022, 64 : 3447 - 3462
  • [4] Fast Support Vector Machine Classification for Large Data Sets
    Xiaoou Li
    Wen Yu
    [J]. International Journal of Computational Intelligence Systems, 2014, 7 : 197 - 212
  • [5] Fast Support Vector Machine Classification for Large Data Sets
    Li, Xiaoou
    Yu, Wen
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2014, 7 (02) : 197 - 212
  • [6] Support vector machine classification of physical and biological datasets
    Cai, CZ
    Wang, WL
    Chen, YZ
    [J]. INTERNATIONAL JOURNAL OF MODERN PHYSICS C, 2003, 14 (05) : 575 - 585
  • [7] Fast Local Support Vector Machines for Large Datasets
    Segata, Nicola
    Blanzieri, Enrico
    [J]. MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, 2009, 5632 : 295 - +
  • [8] Fast, Approximate Vector Queries on Very Large Unstructured Datasets
    Zhang, Zili
    Jin, Chao
    Tang, Linpeng
    Liu, Xuanzhe
    Jin, Xin
    [J]. PROCEEDINGS OF THE 20TH USENIX SYMPOSIUM ON NETWORKED SYSTEMS DESIGN AND IMPLEMENTATION, NSDI 2023, 2023, : 995 - 1011
  • [9] Support vector machines with clustering for training with very large datasets
    Evgeniou, T
    Pontil, M
    [J]. METHODS AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2002, 2308 : 346 - 354
  • [10] Support vector machine approach for fast classification
    Kianmehr, Keivan
    Alhajj, Reda
    [J]. DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, 2006, 4081 : 534 - 543