Tight Dimensionality Reduction for Sketching Low Degree Polynomial Kernels

Cited by: 0

Authors:

Meister, Michela [1,2]

Sarlos, Tamas [2]

Woodruff, David P. [2,3,4]

Affiliations:

[1] Cornell Univ, Ithaca, NY 14850 USA

[2] Google Res, Mountain View, CA 94043 USA

[3] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA

[4] Simons Inst Theory Comp, Berkeley, CA USA

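Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019) | 2019 / Vol. 32

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract: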

We revisit the classic randomized sketch of a tensor product of $q$ vectors $x^i \in \mathbb{R}^n$. The $i$-th coordinate $(Sx)_i$ of the sketch is equal to $\prod_{j=1}^{q} \langle u^{i,j}, x^j \rangle / \sqrt{m}$, where the $u^{i,j}$ are independent random sign vectors. Kar and Karnick (JMLR, 2012) show that if the sketching dimension $m = \Omega(\epsilon^{-2} C_\Omega^2 \log(1/\delta))$, where $C_\Omega$ is a certain property of the point set $\Omega$ one wants to sketch, then with probability $1 - \delta$, $\|Sx\|_2 = (1 \pm \epsilon)\|x\|_2$ for all $x \in \Omega$. However, in their analysis $C_\Omega^2$ can be as large as $\Theta(n^{2q})$, even for a set $\Omega$ of $O(1)$ vectors $x$. We give a new analysis of this sketch, providing nearly optimal bounds. Namely, we show an upper bound of $m = \Theta(\epsilon^{-2} \log(1/\delta) + \epsilon^{-1} \log^q(n/\delta))$, which, by composing with CountSketch, can be improved to $\Theta(\epsilon^{-2} \log(1/(\delta\epsilon)) + \epsilon^{-1} \log^q(1/(\delta\epsilon)))$. For the important case of $q = 2$ and $\delta = 1/\mathrm{poly}(n)$, this shows that $m = \Omega(\epsilon^{-2} \log(n) + \epsilon^{-1} \log^2(n))$, demonstrating that the $\epsilon^{-2}$ and $\log^2(n)$ terms do not multiply each other. We also show a nearly matching lower bound of $m = \Omega(\epsilon^{-2} \log(1/\delta) + \epsilon^{-1} \log^q(1/\delta))$. In a number of applications, one has $|\Omega| = \mathrm{poly}(n)$, and in this case our bounds are optimal up to a constant factor. This is the first high-probability sketch for tensor products that has optimal sketch size and can be implemented in $m \cdot \sum_{i=1}^{q} \mathrm{nnz}(x^i)$ time, where $\mathrm{nnz}(x^i)$ is the number of non-zero entries of $x^i$. Lastly, we empirically compare our sketch to other sketches for tensor products, and give a novel application to compressing neural networks.
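The sketch described in the abstract can be illustrated with a short NumPy example. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `tensor_sign_sketch` and `tensor_product_norm` are hypothetical, the sign vectors are drawn densely (so the cost is proportional to m times the total input length rather than to the number of non-zero entries), and the CountSketch composition mentioned in the abstract is not included.

```python
import numpy as np

def tensor_sign_sketch(vectors, m, rng):
    """Sketch x = x^1 (x) ... (x) x^q without forming the tensor product.

    Coordinate i of the result is prod_j <u^{i,j}, x^j> / sqrt(m), where the
    u^{i,j} are independent random sign vectors (hypothetical helper name).
    """
    sketch = np.ones(m)
    for x in vectors:
        x = np.asarray(x, dtype=float)
        # Row i of `signs` plays the role of u^{i,j} for the current factor j.
        signs = rng.choice([-1.0, 1.0], size=(m, x.shape[0]))
        sketch *= signs @ x  # inner products <u^{i,j}, x^j> for all i at once
    return sketch / np.sqrt(m)

def tensor_product_norm(vectors):
    """Euclidean norm of the tensor product: the product of the factor norms."""
    return float(np.prod([np.linalg.norm(x) for x in vectors]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, n, m = 2, 50, 2000  # illustrative sizes, chosen arbitrarily
    factors = [rng.standard_normal(n) for _ in range(q)]
    sx = tensor_sign_sketch(factors, m, rng)
    # The two printed values should be close when m is large enough.
    print(np.linalg.norm(sx), tensor_product_norm(factors))
```

Because each coordinate only needs the q inner products with the factors, the n^q-dimensional tensor product is never materialized. The squared norm of the sketch equals the squared norm of the tensor product in expectation; how large m must be for a (1 ± ε) guarantee is exactly what the bounds in the abstract quantify.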


Pages: 12
