Spark-based Large-scale Matrix Inversion for Big Data Processing

Cited by: 0
Authors
Liang, Yang [1]
Liu, Jun [1]
Fang, Cheng [1]
Ansari, Nirwan [2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Ctr Data Sci, Beijing 100876, Peoples R China
[2] New Jersey Inst Technol, Elect & Comp Engn Dept, Newark, NJ 07102 USA
Keywords
matrix inversion; LU decomposition; linear algebra; parallel algorithm; distributed computing; Spark
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract
Matrix inversion is a fundamental operation for solving linear equations in many computational applications. However, inverting large-scale matrices of extremely high order (several thousands) is challenging, and such matrices are common in most web-scale systems, such as social networks and recommendation systems. In this paper, we present an LU-decomposition-based block-recursive algorithm for large-scale matrix inversion, together with a well-designed implementation on the Spark parallel computing platform that features an optimized data structure, reduced space complexity, and effective matrix multiplication. The experimental results show that the proposed algorithm inverts large-scale matrices efficiently on a cluster of commodity servers and scales to even larger matrices. The proposed algorithm and implementation provide a solid base for building a high-performance linear algebra library on Spark for big data processing.
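The abstract names the technique but does not spell out its steps, so the following is only a minimal single-machine sketch of a generic textbook block-recursive LU inversion, not the authors' Spark implementation. It assumes every leading principal block is nonsingular (no pivoting is performed), and all function names (`block_lu`, `inv_lower`, `inv_upper`, `invert`) are hypothetical. The matrix is split into four blocks, the top-left block is factored recursively, the Schur complement is factored next, and the inverse is assembled as the product of the inverted triangular factors.

```python
def matmul(a, b):
    # Plain dense product; a distributed version would parallelize this step.
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def neg(a):
    return [[-x for x in row] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

def split(a, k):
    # Split a square matrix into four blocks at row/column k.
    return ([r[:k] for r in a[:k]], [r[k:] for r in a[:k]],
            [r[:k] for r in a[k:]], [r[k:] for r in a[k:]])

def join(a11, a12, a21, a22):
    top = [r1 + r2 for r1, r2 in zip(a11, a12)]
    bottom = [r1 + r2 for r1, r2 in zip(a21, a22)]
    return top + bottom

def inv_lower(l):
    # Inverse of a lower-triangular matrix, computed block by block.
    n = len(l)
    if n == 1:
        return [[1.0 / l[0][0]]]
    k = n // 2
    l11, _, l21, l22 = split(l, k)
    i11, i22 = inv_lower(l11), inv_lower(l22)
    i21 = neg(matmul(matmul(i22, l21), i11))
    return join(i11, zeros(k, n - k), i21, i22)

def inv_upper(u):
    # An upper-triangular inverse is the transpose of a lower-triangular one.
    return transpose(inv_lower(transpose(u)))

def block_lu(a):
    # Returns (L, U) with A = L U, L unit lower-triangular. No pivoting.
    n = len(a)
    if n == 1:
        return [[1.0]], [[float(a[0][0])]]
    k = n // 2
    a11, a12, a21, a22 = split(a, k)
    l1, u1 = block_lu(a11)
    u12 = matmul(inv_lower(l1), a12)
    l21 = matmul(a21, inv_upper(u1))
    l2, u2 = block_lu(sub(a22, matmul(l21, u12)))  # Schur complement
    return (join(l1, zeros(k, n - k), l21, l2),
            join(u1, u12, zeros(n - k, k), u2))

def invert(a):
    # A^-1 = U^-1 L^-1
    l, u = block_lu(a)
    return matmul(inv_upper(u), inv_lower(l))
```

Calling `invert(a)` on a small list-of-lists matrix returns its inverse under the assumptions above. A production version would add pivoting and stop the recursion at a block size that fits on one worker.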
Pages: 6