Using Locality-Sensitive Hashing for SVM Classification of Large Data Sets

Cited by: 4
Authors
Gonzalez-Lima, Maria D. [1]
Ludena, Carenne C. [2]
Affiliations
[1] Univ Norte, Dept Matemat & Estadist, Barranquilla 081007, Colombia
[2] Matrix CPM Solut, Crr 15 93A 84, Bogota 110221, Colombia
Keywords
support vector machines; locality sensitive hashing; classification problems
DOI
10.3390/math10111812
Chinese Library Classification
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
We propose a novel method using Locality-Sensitive Hashing (LSH) for solving the optimization problem that arises in the training stage of support vector machines for large data sets, possibly in high dimensions. LSH was introduced as an efficient way to search for neighbors in high-dimensional spaces. LSH functions based on random projections create bins such that, with high probability, points belonging to the same bin are close to each other, while points that are far apart do not fall in the same bin. Given these bins, it is not necessary to consider the whole original set, only representatives of each bin, which reduces the effective size of the data set. A key aspect of our proposal is that we work in the feature space and use only the projections to search for closeness in this space. Moreover, instead of choosing the projection directions at random, we sample a small subset and solve the associated SVM problem. Projecting in the resulting direction allows for a more precise sample in many cases, and an approximation of the solution of the large problem is found in a fraction of the running time with only a small degradation of the classification error. We present two algorithms, theoretical support, and numerical experiments showing their performance on real-life problems taken from the LIBSVM database.
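The binning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the common random-projection hash h(x) = floor((a·x + b) / w), hashes in the input space rather than the feature space, draws the directions at random rather than from a small SVM solve, and uses the per-class bin centroid as the representative. The names `make_hash`, `lsh_representatives`, `n_hashes`, and `width` are hypothetical.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, width, rng):
    """Build h(x) = floor((a.x + b) / width) with Gaussian a and uniform b."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, width)

    def h(x):
        return math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / width)

    return h


def lsh_representatives(points, labels, n_hashes=2, width=1.0, seed=0):
    """Bin points by (label, hash values) and return one centroid per bin.

    Assumption: the centroid is used as the bin representative; the abstract
    does not say how representatives are chosen.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    hashes = [make_hash(dim, width, rng) for _ in range(n_hashes)]

    # Points with different labels never share a bin, because the label
    # is part of the bin key.
    bins = defaultdict(list)
    for x, y in zip(points, labels):
        key = (y,) + tuple(h(x) for h in hashes)
        bins[key].append(x)

    reps, rep_labels = [], []
    for key, members in bins.items():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        reps.append(centroid)
        rep_labels.append(key[0])
    return reps, rep_labels


if __name__ == "__main__":
    rng = random.Random(1)
    # Two synthetic Gaussian classes in 2D (illustrative data only).
    pts = [[rng.gauss(-2, 1), rng.gauss(-2, 1)] for _ in range(1000)]
    pts += [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(1000)]
    lbl = [-1] * 1000 + [1] * 1000
    reps, rep_lbl = lsh_representatives(pts, lbl)
    print(len(pts), "points reduced to", len(reps), "representatives")
```

The reduced set `reps`, `rep_labels` could then be passed to any SVM solver in place of the full data set. A wider `width` or fewer hash functions gives fewer, coarser bins.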
Pages: 21