S-HOT: Scalable High-Order Tucker Decomposition

Cited by: 30

Authors:

Oh, Jinoh [1,4]

Shin, Kijung [2]

Papalexakis, Evangelos E. [3]

Faloutsos, Christos [2]

Yu, Hwanjo [1]

Affiliations:

[1] POSTECH, Dept Comp Sci & Engn, Pohang, South Korea

[2] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA

[3] Univ Calif Riverside, Dept Comp Sci & Engn, Riverside, CA 92521 USA

[4] CMU, Pittsburgh, PA USA

Source:

WSDM'17: PROCEEDINGS OF THE TENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING | 2017

Keywords:

DOI:

10.1145/3018661.3018721

Chinese Library Classification number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

Multi-aspect data appear frequently in many web-related applications. For example, product reviews are quadruplets of (user, product, keyword, timestamp). How can we analyze such web-scale multi-aspect data? Can we analyze them on an off-the-shelf workstation with a limited amount of memory? Tucker decomposition has been widely used for discovering patterns in relationships among entities in multi-aspect data, which are naturally expressed as high-order tensors. However, existing algorithms for Tucker decomposition have limited scalability and, in particular, fail to decompose high-order tensors (order >= 4) because they explicitly materialize intermediate data whose size grows rapidly as the order increases. We call this problem M-Bottleneck ("Materialization Bottleneck"). To avoid M-Bottleneck, we propose S-HOT, a scalable high-order Tucker decomposition method that employs on-the-fly computation to minimize the materialized intermediate data. Moreover, S-HOT is designed for handling disk-resident tensors that are too large to fit in memory, without loading them all into memory at once. We provide a theoretical analysis of the amount of memory space and the number of data scans required by S-HOT. In our experiments, S-HOT showed better scalability than baseline methods, not only with the order but also with the dimensionality and the rank. In particular, S-HOT decomposed tensors 1000x larger than baseline methods could handle in terms of dimensionality. S-HOT also successfully analyzed real-world tensors that are both large-scale and high-order on an off-the-shelf workstation with a limited amount of memory, while baseline methods failed.
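The contrast between materializing the intermediate data and computing it on the fly can be illustrated with a small sketch. This is not the S-HOT implementation. It assumes the intermediate is the mode-n unfolding of the sparse tensor multiplied by every factor matrix except the mode-n one, as in standard alternating least squares for Tucker decomposition. The sparse format (a list of index/value pairs), the function names, and the toy data are all hypothetical.

```python
# Illustrative sketch only; not the S-HOT implementation.
# Assumed (hypothetical) format: a sparse tensor is a list of
# (index_tuple, value) pairs; factors[m] is an I_m x J_m list of lists.

def kron(rows):
    """Kronecker product of 1-D lists; the first list varies slowest."""
    out = [1.0]
    for row in rows:
        out = [a * b for a in out for b in row]
    return out

def other_rows(idx, factors, mode):
    """Factor-matrix rows selected by a nonzero's indices, skipping `mode`."""
    return [factors[m][i] for m, i in enumerate(idx) if m != mode]

def matvec_materialized(nonzeros, factors, mode, v):
    """Build the full intermediate matrix Y, then return Y times v."""
    dim = len(factors[mode])
    width = len(v)
    # Y has dim * width entries; width is the product of the other ranks,
    # so it grows exponentially with the tensor order.
    Y = [[0.0] * width for _ in range(dim)]
    for idx, x in nonzeros:
        k = kron(other_rows(idx, factors, mode))
        row = Y[idx[mode]]
        for c in range(width):
            row[c] += x * k[c]
    return [sum(yc * vc for yc, vc in zip(row, v)) for row in Y]

def matvec_on_the_fly(nonzeros, factors, mode, v):
    """Return the same product without ever storing Y."""
    out = [0.0] * len(factors[mode])
    for idx, x in nonzeros:
        w = v
        for row in other_rows(idx, factors, mode):
            rest = len(w) // len(row)
            # Contract the slowest-varying axis of w with this factor row.
            w = [sum(row[j] * w[j * rest + k] for j in range(len(row)))
                 for k in range(rest)]
        out[idx[mode]] += x * w[0]
    return out

# Toy 4-order tensor with dimensions (3, 2, 2, 2) and rank 2 in every mode.
factors = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 1.0], [1.0, 0.5]],
    [[2.0, 1.0], [1.0, 2.0]],
]
nonzeros = [((0, 0, 1, 0), 2.0), ((2, 1, 0, 1), -1.0), ((0, 1, 1, 1), 0.5)]
v = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

a = matvec_materialized(nonzeros, factors, 0, v)
b = matvec_on_the_fly(nonzeros, factors, 0, v)
```

Both functions return the same vector, but the second keeps only a working copy of `v` per nonzero instead of a matrix whose width is the product of the ranks of all other modes. Whether S-HOT organizes its computation this way is not stated in the abstract; the sketch only shows why skipping materialization saves memory.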


Pages: 761-770

Number of pages: 10

