Accelerating the SVM learning for very large data sets

Cited by: 0

Authors:

Sung, Eric [1]

Yan, Zhu [1]

Li Xuchun [1]

Affiliations:

[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore

Source:

18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, PROCEEDINGS | 2006

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose an original sequential learning algorithm, SBA, that enables the SVM to learn efficiently from only a small subset of the input data set. The principle is to sequentially add convex hull points of the two classes to a small training pool. The SVM is trained on the current training pool, and its result is used to find the data point that is wrongly classified and furthest from the current optimal hyperplane. This point is added to the training pool and the SVM is retrained on it. The iteration stops when no more such points are found. A formal proof of strict convergence is provided, and we derive a geometric bound on the training time. We also explain how SBA can be extended to handle nonlinear and non-separable class distributions. Experimental trials on several well-known data sets verify the speed advantage of our method, coupled with any SVM, over both that SVM used alone and the core vector machine.
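The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`sequential_select`, `train_perceptron`, `decision`), the seeding of the pool with one point per class, and the toy data are all assumptions made here. To stay dependency-free, a plain perceptron stands in for the SVM trainer, so the hyperplane it returns is separating but not maximum-margin; any SVM solver returning `(w, b)` could be passed as `train` instead.

```python
import random


def decision(w, b, x):
    """Value of the linear decision function w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_perceptron(points, labels, max_epochs=1000):
    """Stand-in trainer (NOT an SVM): finds some hyperplane separating the pool."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(max_epochs):
        updated = False
        for x, y in zip(points, labels):
            if y * decision(w, b, x) <= 0:  # pool point on the wrong side
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                updated = True
        if not updated:  # every pool point is strictly on its correct side
            break
    return w, b


def sequential_select(X, y, train=train_perceptron):
    """Grow a small training pool by repeatedly adding the worst misclassified point."""
    # Assumption: seed the pool with the first example of each class.
    pool = [y.index(1), y.index(-1)]
    while True:
        w, b = train([X[i] for i in pool], [y[i] for i in pool])
        worst_i, worst_margin = None, 0.0
        for i in range(len(X)):
            if i in pool:
                continue
            # Signed margin; dividing by ||w|| would not change the ranking.
            margin = y[i] * decision(w, b, X[i])
            if margin <= 0 and (worst_i is None or margin < worst_margin):
                worst_i, worst_margin = i, margin
        if worst_i is None:  # no misclassified point left: stop
            return pool, w, b
        pool.append(worst_i)  # furthest misclassified point joins the pool


# Toy linearly separable data: two well-separated square clusters.
random.seed(0)
X = [(random.uniform(2, 4), random.uniform(2, 4)) for _ in range(50)]
X += [(random.uniform(-4, -2), random.uniform(-4, -2)) for _ in range(50)]
y = [1] * 50 + [-1] * 50

pool, w, b = sequential_select(X, y)
```

Because each pass adds one point that is not yet in the pool, the loop runs at most once per data point; on data like the toy example the pool is expected to stay far smaller than the full set, which is the source of the speed-up the abstract claims.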

Citation

Pages: 484 / +

Page count: 2
