SGD_Tucker: A Novel Stochastic Optimization Strategy for Parallel Sparse Tucker Decomposition

Cited by: 10
|
Authors
Li, Hao [1 ,2 ,3 ]
Li, Zixuan [1 ,2 ]
Li, Kenli [1 ,2 ]
Rellermeyer, Jan S. [3 ]
Chen, Lydia Y. [3 ]
Li, Keqin [1 ,2 ,4 ]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Hunan, Peoples R China
[2] Natl Supercomp Ctr, Changsha 410082, Hunan, Peoples R China
[3] Delft Univ Technol, NL-2628 CD Delft, Netherlands
[4] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
Funding
Swiss National Science Foundation; National Natural Science Foundation of China;
Keywords
Tensors; Sparse matrices; Optimization; Stochastic processes; Matrix decomposition; Indexes; Data models; High-order, high-dimension and sparse tensor; low-rank representation learning; machine learning algorithm; sparse Tucker decomposition; stochastic optimization; parallel strategy; FACTORIZATION; REDUCTION; NETWORKS;
DOI
10.1109/TPDS.2020.3047460
CLC Classification Number
TP301 [Theory, Methods];
Subject Classification Code
081202;
Abstract
Sparse Tucker Decomposition (STD) algorithms learn a core tensor and a group of factor matrices to obtain an optimal low-rank representation for the High-Order, High-Dimension, and Sparse Tensor (HOHDST). However, existing STD algorithms suffer from an explosion of intermediate variables, because the operations that form those variables, i.e., Khatri-Rao products, Kronecker products, and matrix-matrix multiplications, are carried out over all elements of the sparse tensor. This bottleneck prevents a deep fusion of efficient computation with big-data platforms. To overcome it, a novel stochastic optimization strategy (SGD_Tucker) is proposed for STD, which automatically divides the high-dimension intermediate variables into small batches of intermediate matrices. Specifically, SGD_Tucker follows only randomly selected small samples rather than the whole set of elements, while maintaining the overall accuracy and convergence rate. In practice, SGD_Tucker offers two distinct advances over the state of the art. First, SGD_Tucker can prune the communication overhead for the core tensor in distributed settings. Second, the low data dependence of SGD_Tucker enables fine-grained parallelization, which allows SGD_Tucker to achieve lower computational overhead at the same accuracy. Experimental results show that SGD_Tucker runs at least 2X faster than the state of the art.
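The update rule described in the abstract can be illustrated with a short sketch. The following minimal Python example is an assumption-laden illustration, not the authors' released SGD_Tucker implementation: the function name `sgd_tucker_step`, the learning rate, the regularization constant, and the synthetic data are all hypothetical. It shows how an entry-wise SGD update for Tucker decomposition touches only one sampled nonzero entry at a time, so the large Khatri-Rao/Kronecker intermediates formed by batch STD solvers are never materialized.

```python
# A minimal sketch of entry-wise SGD for sparse Tucker decomposition.
# NOTE: an illustration of the idea in the abstract, not the authors' released
# SGD_Tucker code; function name, step sizes, and synthetic data are assumptions.
import string
import numpy as np

def sgd_tucker_step(index, value, core, factors, lr=0.01, reg=1e-4):
    """One stochastic update driven by a single observed tensor entry.

    index   : tuple (i_1, ..., i_N) of a sampled nonzero entry
    value   : observed value x(i_1, ..., i_N)
    core    : core tensor G with shape (R_1, ..., R_N)
    factors : list of factor matrices; factors[n] has shape (I_n, R_n)
    """
    N = core.ndim
    modes = string.ascii_lowercase[:N]                    # one einsum label per mode
    rows = [factors[n][index[n], :].copy() for n in range(N)]

    # Predicted entry: contract G with the selected factor rows on every mode.
    pred = float(np.einsum(modes + ',' + ','.join(modes) + '->', core, *rows))
    err = value - pred

    # Factor-row gradients: contract G with every *other* selected row,
    # leaving mode n free; only one row per factor matrix is updated.
    for n in range(N):
        others = [m for m in range(N) if m != n]
        sub = modes + ',' + ','.join(modes[m] for m in others) + '->' + modes[n]
        partial = np.einsum(sub, core, *[rows[m] for m in others])
        factors[n][index[n], :] += lr * (err * partial - reg * rows[n])

    # Core gradient: the outer product of the selected rows, scaled by the error.
    outer = np.einsum(','.join(modes) + '->' + modes, *rows)
    core += lr * (err * outer - reg * core)
    return err ** 2                                       # residual for monitoring

# Tiny synthetic run: a 3rd-order tensor with 200 sampled entries.
rng = np.random.default_rng(0)
shape, ranks = (8, 9, 10), (3, 3, 3)
core = 0.1 * rng.standard_normal(ranks)
factors = [0.1 * rng.standard_normal((shape[n], ranks[n])) for n in range(3)]
samples = [(tuple(rng.integers(0, shape[n]) for n in range(3)),
            rng.standard_normal()) for _ in range(200)]
for epoch in range(50):
    for idx, val in samples:
        sgd_tucker_step(idx, val, core, factors)
```

In such a scheme, different sampled entries mostly touch disjoint factor rows, while the shared core tensor receives only small dense updates; this low data dependence is consistent with the fine-grained parallelization and the reduced core-tensor communication that the abstract claims for SGD_Tucker.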
Pages: 1828-1841
Page count: 14