Multi-Aspect Incremental Tensor Decomposition Based on Distributed In-Memory Big Data Systems

Authors
Hye-Kyung Yang [1]
Hwan-Seung Yong [2]
Affiliations
[1] Department of Computer Software, Korean Bible University
[2] Department of Computer Science and Engineering, Ewha Womans University
Funding
National Research Foundation, Singapore;
Keywords
PARAFAC; Tensor decomposition; Incremental tensor decomposition; Apache Spark; Big data;
DOI
Not available
Chinese Library Classification (CLC) number
TP311.13; O183.2 [Tensor analysis];
Discipline classification code
1201;
Abstract
Purpose: We propose InParTen2, a multi-aspect parallel factor analysis (PARAFAC) decomposition algorithm for three-dimensional tensors, built on the Apache Spark framework. The proposed method reduces re-decomposition cost and can handle large tensors.

Design/methodology/approach: Because adding tensors grows a given tensor along all axes, the proposed method decomposes incoming tensors using the existing decomposition results, without generating sub-tensors. In addition, InParTen2 avoids computing Khatri-Rao products and minimizes shuffling on the Apache Spark platform.

Findings: The performance of InParTen2 is evaluated by comparing its execution time and accuracy with those of existing distributed tensor decomposition methods on various datasets. The results confirm that InParTen2 can process large tensors and reduce the re-calculation cost of tensor decomposition. Consequently, the proposed method is faster than existing tensor decomposition algorithms and can significantly reduce re-decomposition cost.

Research limitations: Several Hadoop-based distributed tensor decomposition algorithms exist, as do MATLAB-based decomposition methods. However, the former require longer iteration times, so their execution time cannot be compared with that of Spark-based algorithms, whereas the latter run on a single machine, which limits their ability to handle large data.

Practical implications: When tensors are added to a given tensor, the proposed algorithm reduces re-decomposition cost by decomposing the added tensors based on the existing decomposition results, without re-decomposing the entire tensor.

Originality/value: The proposed method can handle large tensors and is fast within the limited-memory framework of Apache Spark. Moreover, InParTen2 can handle static as well as incremental tensor decomposition.
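The abstract states that the method avoids computing Khatri-Rao products but does not say how. As a rough illustration of the general idea only, and not the authors' InParTen2 implementation, the sketch below computes the mode-1 matricized-tensor-times-Khatri-Rao-product (MTTKRP) step of PARAFAC in two ways: once by materializing the Khatri-Rao product and once per nonzero entry without forming it. The function names, the dictionary-based sparse format, and the use of plain Python instead of Spark are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]
# Hypothetical COO-style sparse tensor: (i, j, k) -> value
SparseTensor = Dict[Tuple[int, int, int], float]


def khatri_rao(C: Matrix, B: Matrix) -> Matrix:
    """Explicit column-wise Khatri-Rao product of C and B, shape (K*J, R).

    Row k*J + j holds the elementwise product of C[k] and B[j].
    """
    rank = len(B[0])
    return [[c_row[r] * b_row[r] for r in range(rank)]
            for c_row in C for b_row in B]


def mttkrp_explicit(X: SparseTensor, I: int, B: Matrix, C: Matrix) -> Matrix:
    """Baseline: mode-1 MTTKRP using the materialized Khatri-Rao product."""
    J, rank = len(B), len(B[0])
    kr = khatri_rao(C, B)  # needs K*J*R memory
    M = [[0.0] * rank for _ in range(I)]
    for (i, j, k), v in X.items():
        row = kr[k * J + j]
        for r in range(rank):
            M[i][r] += v * row[r]
    return M


def mttkrp_sparse(X: SparseTensor, I: int, B: Matrix, C: Matrix) -> Matrix:
    """Same result, accumulated per nonzero without the Khatri-Rao product."""
    rank = len(B[0])
    M = [[0.0] * rank for _ in range(I)]
    for (i, j, k), v in X.items():
        for r in range(rank):
            M[i][r] += v * B[j][r] * C[k][r]
    return M


if __name__ == "__main__":
    X = {(0, 0, 0): 1.0, (1, 2, 1): 2.0, (0, 1, 1): 3.0}
    B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    C = [[1.0, 0.5], [2.0, 1.0]]
    print(mttkrp_sparse(X, 2, B, C))  # [[19.0, 13.0], [20.0, 12.0]]
```

The per-nonzero version touches only the factor rows that each nonzero entry refers to, so its memory use scales with the number of nonzeros rather than with K*J*R. In a distributed setting, a formulation of this kind could plausibly be expressed as a map over tensor entries followed by a reduction keyed on the row index; whether InParTen2 is organized this way is not stated in the abstract.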
Pages: 13-32
Page count: 20