TuckerMPI: A Parallel C++/MPI Software Package for Large-scale Data Compression via the Tucker Tensor Decomposition

Cited by: 24
Authors
Ballard, Grey [1]
Klinvex, Alicia [2]
Kolda, Tamara G. [2]
Affiliations
[1] Wake Forest Univ, Dept Comp Sci, Winston Salem, NC 27109 USA
[2] Sandia Natl Labs, Livermore, CA 94551 USA
Source
Funding
U.S. National Science Foundation;
Keywords
Tucker decomposition; tensor decomposition; higher-order singular value decomposition (HOSVD); collective communication; truncation;
DOI
10.1145/3378445
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Our goal is compression of massive-scale grid-structured data, such as the multi-terabyte output of a high-fidelity computational simulation. For such data sets, we have developed TuckerMPI, a parallel C++/MPI software package for compressing distributed data. The approach is based on treating the data as a tensor, i.e., a multidimensional array, and computing its truncated Tucker decomposition, a higher-order analogue to the truncated singular value decomposition of a matrix. The result is a low-rank approximation of the original tensor-structured data. Compression efficiency is achieved by detecting latent global structure within the data, in contrast to most compression methods, which focus on local structure. In this work, we describe TuckerMPI, our implementation of the truncated Tucker decomposition, including details of the data distribution and in-memory layouts, the parallel and serial implementations of the key kernels, and analysis of the storage, communication, and computational costs. We test the software on 4.5 and 6.7 terabyte data sets distributed across hundreds of nodes (thousands of MPI processes), achieving compression ratios between 100 and 200,000x, which equates to 99-99.999% compression (depending on the desired accuracy), in substantially less time than it would take to even read the same data set from a parallel file system. Moreover, we show that our method also allows for reconstruction of partial or down-sampled data on a single node, without a parallel computer, so long as the reconstructed portion is small enough to fit on a single machine, e.g., when reconstructing or visualizing a single down-sampled time step or computing summary statistics. The code is available at https://gitlab.com/tensors/TuckerMPI.
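To make the idea in the abstract concrete, the following is a minimal serial sketch in Python/NumPy of a truncated Tucker decomposition and the storage ratio it implies. It is an illustration only and is not taken from TuckerMPI: the function names (`unfold`, `ttm`, `st_hosvd`, `reconstruct`, `compression_ratio`) are hypothetical, the ranks are fixed by the caller rather than chosen from an accuracy target, and the sequentially truncated variant with a Gram-matrix eigendecomposition is assumed here as one common way to obtain the factor matrices.

```python
import numpy as np


def unfold(x, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the other modes."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)


def ttm(x, matrix, mode):
    """Tensor-times-matrix: apply `matrix` (rows x x.shape[mode]) along `mode`."""
    moved = np.moveaxis(x, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)


def st_hosvd(x, ranks):
    """Sequentially truncated HOSVD with caller-supplied ranks (illustrative).

    Returns a core tensor of shape `ranks` and one factor matrix per mode.
    """
    core = x
    factors = []
    for mode, rank in enumerate(ranks):
        unfolded = unfold(core, mode)
        gram = unfolded @ unfolded.T
        # eigh returns eigenvalues in ascending order, so reverse the columns
        # to put the leading eigenvectors first, then keep `rank` of them.
        _, eigvecs = np.linalg.eigh(gram)
        u = eigvecs[:, ::-1][:, :rank]
        factors.append(u)
        # Shrink the working tensor along this mode before the next one.
        core = ttm(core, u.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Expand the core back to full size by applying every factor matrix."""
    x = core
    for mode, u in enumerate(factors):
        x = ttm(x, u, mode)
    return x


def compression_ratio(shape, ranks):
    """Entries in the full tensor divided by entries in core plus factors."""
    full = np.prod(shape)
    compressed = np.prod(ranks) + sum(n * r for n, r in zip(shape, ranks))
    return full / compressed
```

A compression ratio c corresponds to a storage reduction of 1 - 1/c, so a ratio of 100 means 99% of the original storage is saved. Because `reconstruct` applies one factor matrix per mode, selecting only some rows of a factor matrix before calling it yields only the matching slice of the data, which is the kind of partial reconstruction the abstract describes.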
Pages: 31
Related papers
50 in total
  • [1] Parallel Tensor Compression for Large-Scale Scientific Data
    Austin, Woody
    Ballard, Grey
    Kolda, Tamara G.
    [J]. 2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2016), 2016, : 912 - 922
  • [2] Large-scale Tucker tensor factorization for sparse and accurate decomposition
    Jang, Jun-Gi
    Park, Moonjeong
    Lee, Jongwuk
    Sael, Lee
    [J]. JOURNAL OF SUPERCOMPUTING, 2022, 78 (16): : 17992 - 18022
  • [3] Large-scale Tucker tensor factorization for sparse and accurate decomposition
    Jun-Gi Jang
    Moonjeong Park
    Jongwuk Lee
    Lee Sael
    [J]. The Journal of Supercomputing, 2022, 78 : 17992 - 18022
  • [4] GPUTucker: Large-Scale GPU-Based Tucker Decomposition Using Tensor Partitioning
    Lee, Jihye
    Han, Donghyoung
    Kwon, Oh-Kyoung
    Chon, Kang-Wook
    Kim, Min-Soo
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [5] An Interactive Reverse Engineering Environment for Large-Scale C++ Code
    Telea, Alexandru
    Voinea, Lucian
    [J]. SOFTVIS 2008: PROCEEDINGS OF THE 4TH ACM SYMPOSIUM ON SOFTWARE VISUALIZATION, 2008, : 67 - 76
  • [6] Hardware Acceleration in Large-Scale Tensor Decomposition for Neural Network Compression
    Kao, Chen-Chien
    Hsieh, Yi-Yen
    Chen, Chao-Hung
    Yang, Chia-Hsiang
    [J]. 2022 IEEE 65TH INTERNATIONAL MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS (MWSCAS 2022), 2022,
  • [7] Scalable and Robust Tensor Ring Decomposition for Large-scale Data
    He, Yicong
    Atia, George K.
    [J]. UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 860 - 869
  • [8] Optimization strategies using hybrid MPI+OpenMP parallelization for large-scale data visualization on Earth Simulator
    Chen, Li
    Fujishiro, Issei
    [J]. PRACTICAL PROGRAMMING MODEL FOR THE MULTI-CORE ERA, PROCEEDINGS, 2008, 4935 : 112 - +
  • [9] Infrared Image Monitoring Data Compression of Power Distribution Network via Tensor Tucker Decomposition
    Zhao, Hongshan
    Feng, Jiahao
    Ma, Libo
    [J]. Dianwang Jishu/Power System Technology, 2021, 45 (04): : 1632 - 1639
  • [10] Automated Fortran-C++ Bindings for Large-Scale Scientific Applications
    Johnson, Seth R.
    Prokopenko, Andrey
    Evans, Katherine J.
    [J]. COMPUTING IN SCIENCE & ENGINEERING, 2020, 22 (05) : 84 - 93