Fast Support Vector Machine classification of very large datasets

Cited by: 4

Authors:

Fehr, Janis [1]

Arreola, Karina Zapien [2]

Burkhardt, Hans

Affiliations:

[1] Univ Freiburg, Chair Pattern Recognit & Image Proc, D-79110 Freiburg, Germany

[2] INSA Rouen, LITIS, F-76801 Rouen, France

Source:

DATA ANALYSIS, MACHINE LEARNING AND APPLICATIONS | 2008

Keywords:

DOI:

10.1007/978-3-540-78246-9_2

Chinese Library Classification (CLC) number:

F [Economics];

Discipline classification code:

02 ;

Abstract:

In many classification applications, Support Vector Machines (SVMs) have proven to be high-performing, easy-to-handle classifiers with very good generalization abilities. However, one drawback of the SVM is its rather high classification complexity, which scales linearly with the number of Support Vectors (SVs). This is because, to classify one sample, the kernel function has to be evaluated for all SVs. To speed up classification, different approaches have been published, most of which try to reduce the number of SVs. In our work, which is especially suitable for very large datasets, we follow a different approach: as we showed in (Zapien et al. 2006), it is possible to effectively approximate large SVM problems by decomposing the original problem into linear subproblems, where each subproblem can be evaluated in Omega(1). This approach is especially successful when the assumption holds that a large classification problem can be split into mainly easy and only a few hard subproblems. On standard benchmark datasets, this approach achieved great speedups while suffering only slightly in terms of classification accuracy and generalization ability. In this contribution, we extend the methods introduced in (Zapien et al. 2006) by using not only linear but also non-linear subproblems for the decomposition of the original problem, which further increases the classification performance with only a little loss in terms of speed. An implementation of our method is available in (Ronneberger et al.). Due to page limitations, we had to move some of the theoretical details (e.g. proofs) and extensive experimental results to a technical report (Zapien et al. 2007).
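The cost argument in the abstract can be illustrated with a minimal sketch. It is not the authors' decomposition method: it only shows why a kernel SVM needs one kernel evaluation per SV, and why a linear decision function can instead be collapsed into a single weight vector whose evaluation cost does not depend on the number of SVs. All names and toy values below (`svm_decision`, `collapse_linear`, the coefficients, the sample `x`) are hypothetical and chosen for illustration.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an arbitrary illustrative value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def linear_kernel(x, z):
    """Plain dot product."""
    return sum(a * b for a, b in zip(x, z))


def svm_decision(x, support_vectors, coeffs, bias, kernel):
    """f(x) = sum_i coeff_i * K(sv_i, x) + bias.

    One kernel call per support vector, so the cost grows
    linearly with the number of SVs.
    """
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, coeffs)) + bias


def collapse_linear(support_vectors, coeffs):
    """w = sum_i coeff_i * sv_i; valid only for the linear kernel."""
    dim = len(support_vectors[0])
    return [sum(c * sv[d] for sv, c in zip(support_vectors, coeffs))
            for d in range(dim)]


def linear_decision(x, w, bias):
    """f(x) = <w, x> + bias; cost is independent of the number of SVs."""
    return linear_kernel(w, x) + bias


# Hypothetical toy model: coeff_i stands for alpha_i * y_i.
support_vectors = [(1.0, 2.0), (-1.0, 0.5), (0.0, -1.5)]
coeffs = [0.7, -0.4, -0.3]
bias = 0.1
x = (0.5, 1.0)

w = collapse_linear(support_vectors, coeffs)
full = svm_decision(x, support_vectors, coeffs, bias, linear_kernel)
fast = linear_decision(x, w, bias)
nonlinear = svm_decision(x, support_vectors, coeffs, bias, rbf_kernel)
```

Here `full` and `fast` agree up to floating-point rounding, while `nonlinear` has no such shortcut and must visit every SV. That asymmetry is what makes mostly linear subproblems attractive when the number of SVs is large.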


Pages: 11 / +

Page count: 2


[1] Mining very large datasets with support vector machine algorithms
Poulet, F
Do, TN
[J]. ENTERPRISE INFORMATION SYSTEMS V, 2004, : 177 - 184
[2] Multiresolution hierarchical support vector machine for classification of large datasets
Alwajidi, Safaa
Yang, Li
[J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2022, 64 (12) : 3447 - 3462
[3] Multiresolution hierarchical support vector machine for classification of large datasets
Safaa Alwajidi
Li Yang
[J]. Knowledge and Information Systems, 2022, 64 : 3447 - 3462
[4] Fast Support Vector Machine Classification for Large Data Sets
Xiaoou Li
Wen Yu
[J]. International Journal of Computational Intelligence Systems, 2014, 7 : 197 - 212
[5] Fast Support Vector Machine Classification for Large Data Sets
Li, Xiaoou
Yu, Wen
[J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2014, 7 (02) : 197 - 212
[6] Support vector machine classification of physical and biological datasets
Cai, CZ
Wang, WL
Chen, YZ
[J]. INTERNATIONAL JOURNAL OF MODERN PHYSICS C, 2003, 14 (05) : 575 - 585
[7] Fast Local Support Vector Machines for Large Datasets
Segata, Nicola
Blanzieri, Enrico
[J]. MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, 2009, 5632 : 295 - +
[8] Fast, Approximate Vector Queries on Very Large Unstructured Datasets
Zhang, Zili
Jin, Chao
Tang, Linpeng
Liu, Xuanzhe
Jin, Xin
[J]. PROCEEDINGS OF THE 20TH USENIX SYMPOSIUM ON NETWORKED SYSTEMS DESIGN AND IMPLEMENTATION, NSDI 2023, 2023, : 995 - 1011
[9] Support vector machines with clustering for training with very large datasets
Evgeniou, T
Pontil, M
[J]. METHODS AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2002, 2308 : 346 - 354
[10] Support vector machine approach for fast classification
Kianmehr, Keivan
Alhajj, Reda
[J]. DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, 2006, 4081 : 534 - 543
