Machine learning friendly set version of Johnson-Lindenstrauss lemma

Cited by: 0
Author
Klopotek, Mieczyslaw A. [1]
Affiliation
[1] Polish Acad Sci, Inst Comp Sci, Ul Jana Kazimierza 5, PL-01248 Warsaw, Poland
Keywords
Johnson-Lindenstrauss lemma; Random projection; Sample distortion; Dimensionality reduction; Linear JL transform; k-means algorithm; Clusterability retention; RANDOM-PROJECTION; PROOF;
DOI
10.1007/s10115-019-01412-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The widely discussed and applied Johnson-Lindenstrauss (JL) Lemma has an existential form saying that for each set of data points $Q$ in $n$-dimensional space, there exists a transformation $f$ into an $n'$-dimensional space ($n' < n$) such that for each pair $u, v \in Q$, $(1 - \delta)\|u - v\|^2 \le \|f(u) - f(v)\|^2 \le (1 + \delta)\|u - v\|^2$ for a user-defined error parameter $\delta$. Furthermore, it is asserted that with some finite probability the transformation $f$ may be found as a random projection (with scaling) onto the $n'$-dimensional subspace, so that after sufficiently many repetitions of random projection, $f$ will be found with user-defined success rate $1 - \epsilon$. In this paper, we make a novel use of the JL Lemma. We prove a theorem stating that we can choose the target dimensionality in a random projection-type JL linear transformation in such a way that, when performing a single random projection, with probability $1 - \epsilon$ all data points from $Q$ fall into the predefined error range $\delta$, for any user-predefined failure probability $\epsilon$. This result is important for applications such as data clustering, where we want to have an a priori dimensionality-reducing transformation instead of attempting a (large) number of them, as with the traditional Johnson-Lindenstrauss Lemma. Furthermore, we investigate the important issue of whether the projection according to the JL Lemma is really useful when conducting data processing, that is, whether the solutions to the clustering in the projected space apply to the original space. In particular, we take a closer look at the k-means algorithm and prove that a good solution in the projected space is also a good solution in the original space. Furthermore, under proper assumptions, local optima in the original space are also local optima in the projected space. We also investigate the broader issue of preserving clusterability under JL Lemma projection. We define the conditions under which the clusterability property of the original space is transmitted to the projected space, so that a broad class of clustering algorithms for the original space is applicable in the projected space.
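As an illustration of the distortion condition stated in the abstract, the following minimal sketch projects a small point set once and measures the worst relative error of the squared pairwise distances. It is not the paper's construction: the scaled Gaussian matrix is a commonly used stand-in for a random projection, the target dimension of 200 is picked by hand rather than derived from the paper's theorem, and all names (`random_projection`, `max_distortion`, `Q_proj`, `delta_hat`) are hypothetical.

```python
import itertools
import math
import random


def random_projection(points, target_dim, rng):
    """Map points to target_dim dimensions with a Gaussian matrix scaled by 1/sqrt(target_dim).

    Assumption: a scaled Gaussian matrix stands in for the random projection
    with scaling mentioned in the abstract.
    """
    source_dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(source_dim)]
              for _ in range(target_dim)]
    return [[sum(r * x for r, x in zip(row, p)) for row in matrix] for p in points]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def max_distortion(original, projected):
    """Largest relative error of squared pairwise distances over all pairs."""
    worst = 0.0
    for i, j in itertools.combinations(range(len(original)), 2):
        ratio = sq_dist(projected[i], projected[j]) / sq_dist(original[i], original[j])
        worst = max(worst, abs(ratio - 1.0))
    return worst


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical data: 10 points in 400 dimensions, projected once to 200 dimensions.
Q = [[rng.uniform(-1.0, 1.0) for _ in range(400)] for _ in range(10)]
Q_proj = random_projection(Q, 200, rng)
delta_hat = max_distortion(Q, Q_proj)
print(f"empirical distortion: {delta_hat:.3f}")
```

If `delta_hat` is at most the chosen $\delta$, this single projection satisfies the two-sided inequality for every pair in `Q`. The abstract's theorem concerns choosing the target dimension so that this happens with probability $1 - \epsilon$ in a single draw.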
Pages: 1961 - 2009
Page count: 49
Related papers
50 in total
  • [31] An Almost Optimal Unrestricted Fast Johnson-Lindenstrauss Transform
    Ailon, Nir
    Liberty, Edo
    PROCEEDINGS OF THE TWENTY-SECOND ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2011, : 185 - 191
  • [32] THE FAST JOHNSON-LINDENSTRAUSS TRANSFORM AND APPROXIMATE NEAREST NEIGHBORS
    Ailon, Nir
    Chazelle, Bernard
    SIAM JOURNAL ON COMPUTING, 2009, 39 (01) : 302 - 322
  • [33] Private Query Release via the Johnson-Lindenstrauss Transform
    Nikolov, Aleksandar
    PROCEEDINGS OF THE 2023 ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, SODA, 2023, : 4982 - 5002
  • [34] On Using Toeplitz and Circulant Matrices for Johnson-Lindenstrauss Transforms
    Freksen, Casper Benjamin
    Larsen, Kasper Green
    ALGORITHMICA, 2020, 82 (02) : 338 - 354
  • [35] Differential Private POI Queries via Johnson-Lindenstrauss Transform
    Yang, Mengmeng
    Zhu, Tianqing
    Liu, Bo
    Xiang, Yang
    Zhou, Wanlei
    IEEE ACCESS, 2018, 6 : 29685 - 29699
  • [36] Privacy Preserving Collaborative Filtering via the Johnson-Lindenstrauss Transform
    Yang, Mengmeng
    Zhu, Tianqing
    Ma, Lichuan
    Xiang, Yang
    Zhou, Wanlei
    2017 16TH IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS / 11TH IEEE INTERNATIONAL CONFERENCE ON BIG DATA SCIENCE AND ENGINEERING / 14TH IEEE INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS, 2017, : 417 - 424
  • [37] Accelerating Image Registration With the Johnson-Lindenstrauss Lemma: Application to Imaging 3-D Neural Ultrastructure With Electron Microscopy
    Akselrod-Ballin, Ayelet
    Bock, Davi
    Reid, R. Clay
    Warfield, Simon K.
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2011, 30 (07) : 1427 - 1438
  • [38] Practical Johnson-Lindenstrauss Transforms via Algebraic Geometry Codes
    You, Lin
    Knoll, Fiona
    Mao, Yue
    Gao, Shuhong
    2017 INTERNATIONAL CONFERENCE ON CONTROL, ARTIFICIAL INTELLIGENCE, ROBOTICS & OPTIMIZATION (ICCAIRO), 2017, : 171 - 176
  • [39] Extremely Sparse Johnson-Lindenstrauss Transform: From Theory to Algorithm
    Yin, Rong
    Liu, Yong
    Wang, Weiping
    Meng, Dan
    20TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2020), 2020, : 1376 - 1381
  • [40] Fast Johnson-Lindenstrauss Transform for Robust and Secure Image Hashing
    Lv, Xudong
    Wang, Z. Jane
    2008 IEEE 10TH WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, VOLS 1 AND 2, 2008, : 729 - 733