Compressed linear algebra for large-scale machine learning

Cited by: 13
|
Authors
Elgohary, Ahmed [2 ]
Boehm, Matthias [1 ]
Haas, Peter J. [1 ]
Reiss, Frederick R. [1 ]
Reinwald, Berthold [1 ]
Affiliations
[1] IBM Res Almaden, San Jose, CA 95120 USA
[2] Univ Maryland, College Pk, MD 20742 USA
Source
VLDB JOURNAL | 2018, Vol. 27, No. 5
Keywords
Machine learning; Large-scale; Declarative; Linear algebra; Lossless compression; DATABASE; FACTORIZATION; DB2;
DOI
10.1007/s00778-017-0478-1
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Discipline Code
0812;
Abstract
Large-scale machine learning algorithms are often iterative, using repeated read-only data access and I/O-bound matrix-vector multiplications to converge to an optimal model. It is crucial for performance to fit the data into single-node or distributed main memory and enable fast matrix-vector operations on in-memory data. General-purpose, heavy- and lightweight compression techniques struggle to achieve both good compression ratios and fast decompression speed to enable block-wise uncompressed operations. Therefore, we initiate work, inspired by database compression and sparse matrix formats, on value-based compressed linear algebra (CLA), in which heterogeneous, lightweight database compression techniques are applied to matrices, and then linear algebra operations such as matrix-vector multiplication are executed directly on the compressed representation. We contribute effective column compression schemes, cache-conscious operations, and an efficient sampling-based compression algorithm. Our experiments show that CLA achieves in-memory operations performance close to the uncompressed case and good compression ratios, which enables fitting substantially larger datasets into available memory. We thereby obtain significant end-to-end performance improvements.
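To make the core idea concrete, below is a minimal NumPy sketch of a matrix-vector multiply executed directly on a compressed column group, in the spirit of the offset-list encoding the abstract alludes to. It is an illustration under stated assumptions, not the paper's implementation: the names OLEColumnGroup and rmult are hypothetical, and real CLA additionally supports other encodings, co-coding decisions, and cache-conscious blocking.

```python
import numpy as np

class OLEColumnGroup:
    """Hypothetical offset-list-encoded column group: the columns it covers,
    the distinct value tuples appearing in those columns, and, per tuple,
    the list of row offsets where that tuple occurs."""

    def __init__(self, col_indices, value_tuples, offset_lists):
        self.col_indices = col_indices    # columns covered by this group
        self.value_tuples = value_tuples  # distinct value tuples
        self.offset_lists = offset_lists  # row offsets per distinct tuple

    def rmult(self, v, y):
        """Accumulate this group's contribution to y = X @ v directly on the
        compressed form: pre-aggregate the dot product once per distinct
        tuple, then scatter-add it to the rows where the tuple occurs."""
        vg = v[self.col_indices]
        for t, rows in zip(self.value_tuples, self.offset_lists):
            u = float(np.dot(t, vg))  # one multiply per distinct tuple
            y[rows] += u              # reused across all its occurrences

# Usage: a 6x3 matrix where columns {0,1} are co-coded and column 2 is
# encoded separately; no row of the matrix is ever materialized.
g1 = OLEColumnGroup(np.array([0, 1]),
                    [np.array([1.0, 2.0]), np.array([3.0, 4.0])],
                    [np.array([0, 2, 5]), np.array([1, 3, 4])])
g2 = OLEColumnGroup(np.array([2]),
                    [np.array([7.0])],
                    [np.arange(6)])

v = np.array([1.0, 1.0, 1.0])
y = np.zeros(6)
for g in (g1, g2):
    g.rmult(v, y)
print(y)  # [10. 14. 10. 14. 14. 10.]
```

The pre-aggregation step is why the cost scales with the number of distinct tuples plus total offsets rather than with the dense matrix size, which is what lets operations on the compressed representation approach uncompressed speed.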
Pages: 719-744
Number of pages: 26
Related papers
50 records in total
  • [1] Compressed Linear Algebra for Large-Scale Machine Learning
    Elgohary, Ahmed
    Boehm, Matthias
    Haas, Peter J.
    Reiss, Frederick R.
    Reinwald, Berthold
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2016, 9 (12): 960-971
  • [2] Compressed linear algebra for large-scale machine learning
    Elgohary, Ahmed
    Boehm, Matthias
    Haas, Peter J.
    Reiss, Frederick R.
    Reinwald, Berthold
    [J]. The VLDB Journal, 2018, 27: 719-744
  • [3] Compressed Linear Algebra for Declarative Large-Scale Machine Learning
    Elgohary, Ahmed
    Boehm, Matthias
    Haas, Peter J.
    Reiss, Frederick R.
    Reinwald, Berthold
    [J]. COMMUNICATIONS OF THE ACM, 2019, 62 (05): 83-91
  • [4] Scaling Machine Learning via Compressed Linear Algebra
    Elgohary, Ahmed
    Boehm, Matthias
    Haas, Peter J.
    Reiss, Frederick R.
    Reinwald, Berthold
    [J]. SIGMOD RECORD, 2017, 46 (01): 42-49
  • [5] Technical Perspective: Scaling Machine Learning via Compressed Linear Algebra
    Ives, Zachary G.
    [J]. SIGMOD RECORD, 2017, 46 (01): 41
  • [6] A Survey on Large-Scale Machine Learning
    Wang, Meng
    Fu, Weijie
    He, Xiangnan
    Hao, Shijie
    Wu, Xindong
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (06): 2574-2594
  • [7] Linear algebra software for large-scale accelerated multicore computing
    Abdelfatah, A.
    Anzt, H.
    Dongarra, J.
    Gates, M.
    Haidar, A.
    Kurzak, J.
    Luszczek, P.
    Tomov, S.
    Yamazaki, I.
    YarKhan, A.
    [J]. ACTA NUMERICA, 2016, 25: 1-160
  • [8] Optimizing Sparse Linear Algebra for Large-Scale Graph Analytics
    Buono, Daniele
    Gunnels, John A.
    Que, Xinyu
    Checconi, Fabio
    Petrini, Fabrizio
    Tuan, Tai-Ching
    Long, Chris
    [J]. COMPUTER, 2015, 48 (08): 26-34
  • [9] Large-scale distributed linear algebra with tensor processing units
    Lewis, Adam G. M.
    Beall, Jackson
    Ganahl, Martin
    Hauru, Markus
    Mallick, Shrestha Basu
    Vidal, Guifre
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2022, 119 (33)
  • [10] Efficient Machine Learning On Large-Scale Graphs
    Erickson, Parker
    Lee, Victor E.
    Shi, Feng
    Tang, Jiliang
    [J]. PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022: 4788-4789