Spark-based Large-scale Matrix Inversion for Big Data Processing

Cited by: 0

Authors:

Liang, Yang [1]

Liu, Jun [1]

Fang, Cheng [1]

Ansari, Nirwan [2]

Affiliations:

[1] Beijing Univ Posts & Telecommun, Ctr Data Sci, Beijing 100876, Peoples R China

[2] New Jersey Inst Technol, Elect & Comp Engn Dept, Newark, NJ 07102 USA

Source:

2016 IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS (INFOCOM WKSHPS) | 2016

Keywords:

matrix inversion; LU decomposition; linear algebra; parallel algorithm; distributed computing; Spark

DOI:

Not available

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronics and Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

Matrix inversion is a fundamental operation for solving linear equations in many computational applications. However, it is challenging to invert large-scale matrices of extremely high order (several thousand), which are common in most web-scale systems such as social networks and recommendation systems. In this paper, we present an LU-decomposition-based block-recursive algorithm for large-scale matrix inversion, together with a carefully designed implementation on the Spark parallel computing platform that features an optimized data structure, reduced space complexity, and efficient matrix multiplication. Experimental results show that the proposed algorithm efficiently inverts large-scale matrices on a cluster of commodity servers and scales to even larger matrices. The proposed algorithm and implementation will be a solid base for building a high-performance linear algebra library on Spark for big data processing.
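
The general idea named in the abstract, inverting a matrix through a block-recursive LU decomposition, can be illustrated with a minimal single-machine sketch. This is not the authors' Spark implementation: it runs on plain Python lists, omits pivoting for brevity (so it assumes every leading principal submatrix is nonsingular, as in a strictly diagonally dominant matrix), and all function names (`block_lu`, `inv_lower`, `inv_upper`, `invert`) are hypothetical. It factors A = LU by splitting A into 2x2 blocks, inverts the triangular factors blockwise, and forms the inverse as U^{-1} L^{-1}.

```python
from typing import List, Tuple

Matrix = List[List[float]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    # Plain triple-loop product; a distributed version would parallelize this step.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def sub(a: Matrix, b: Matrix) -> Matrix:
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def neg(a: Matrix) -> Matrix:
    return [[-x for x in row] for row in a]

def zeros(rows: int, cols: int) -> Matrix:
    return [[0.0] * cols for _ in range(rows)]

def split(a: Matrix, h: int) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    # Returns the four blocks A11, A12, A21, A22 with A11 of size h x h.
    return ([r[:h] for r in a[:h]], [r[h:] for r in a[:h]],
            [r[:h] for r in a[h:]], [r[h:] for r in a[h:]])

def join(a11: Matrix, a12: Matrix, a21: Matrix, a22: Matrix) -> Matrix:
    top = [r1 + r2 for r1, r2 in zip(a11, a12)]
    bottom = [r1 + r2 for r1, r2 in zip(a21, a22)]
    return top + bottom

def inv_lower(l: Matrix) -> Matrix:
    # inv([[L11, 0], [L21, L22]]) = [[inv(L11), 0], [-inv(L22) L21 inv(L11), inv(L22)]]
    n = len(l)
    if n == 1:
        return [[1.0 / l[0][0]]]
    h = n // 2
    l11, _, l21, l22 = split(l, h)
    i11, i22 = inv_lower(l11), inv_lower(l22)
    i21 = neg(matmul(matmul(i22, l21), i11))
    return join(i11, zeros(h, n - h), i21, i22)

def inv_upper(u: Matrix) -> Matrix:
    # inv([[U11, U12], [0, U22]]) = [[inv(U11), -inv(U11) U12 inv(U22)], [0, inv(U22)]]
    n = len(u)
    if n == 1:
        return [[1.0 / u[0][0]]]
    h = n // 2
    u11, u12, _, u22 = split(u, h)
    i11, i22 = inv_upper(u11), inv_upper(u22)
    i12 = neg(matmul(matmul(i11, u12), i22))
    return join(i11, i12, zeros(n - h, h), i22)

def block_lu(a: Matrix) -> Tuple[Matrix, Matrix]:
    # Recursive LU without pivoting (assumption: no zero pivots are encountered).
    n = len(a)
    if n == 1:
        return [[1.0]], [[a[0][0]]]
    h = n // 2
    a11, a12, a21, a22 = split(a, h)
    l11, u11 = block_lu(a11)
    u12 = matmul(inv_lower(l11), a12)                 # U12 = inv(L11) A12
    l21 = matmul(a21, inv_upper(u11))                 # L21 = A21 inv(U11)
    l22, u22 = block_lu(sub(a22, matmul(l21, u12)))   # factor the Schur complement
    lower = join(l11, zeros(h, n - h), l21, l22)
    upper = join(u11, u12, zeros(n - h, h), u22)
    return lower, upper

def invert(a: Matrix) -> Matrix:
    lower, upper = block_lu(a)
    return matmul(inv_upper(upper), inv_lower(lower))  # inv(A) = inv(U) inv(L)
```

Calling `invert` on a small diagonally dominant matrix and multiplying the result by the original should give the identity up to floating-point error. A production implementation would add pivoting for numerical stability and replace `matmul` with a distributed block multiplication.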

Pages: 6
