Fast and Robust Parallel SGD Matrix Factorization

Cited by: 30
Authors
Oh, Jinoh [1]
Han, Wook-Shin [1]
Yu, Hwanjo [1]
Jiang, Xiaoqian [2]
Affiliations
[1] Pohang Univ Sci & Technol POSTECH, Pohang, South Korea
[2] UCSD, La Jolla, CA USA
Source
KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2015
Funding
National Research Foundation, Singapore;
Keywords
Matrix factorization; Stochastic gradient descent;
DOI
10.1145/2783258.2783322
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Matrix factorization is one of the fundamental techniques for analyzing latent relationships between two entities. In particular, it is used for recommendation because of its high accuracy. Efficient parallel SGD matrix factorization algorithms have been developed to speed up the convergence of factorization on large matrices. However, most of them are designed for a shared-memory environment and thus fail to factorize a matrix that is too large to fit in memory, and their performance is also unreliable when the matrix is skewed. This paper proposes a fast and robust parallel SGD matrix factorization algorithm, called MLGF-MF, which is robust to skewed matrices and runs efficiently on block-storage devices (e.g., SSDs) as well as in shared memory. MLGF-MF uses the Multi-Level Grid File (MLGF) to partition the matrix and minimizes the cost of scheduling parallel SGD updates on the partitioned regions by exploiting partial match query processing. As a result, MLGF-MF produces reliable results efficiently even on skewed matrices. MLGF-MF is designed with asynchronous I/O throughout the algorithm, so that the CPU keeps executing without waiting for I/O to complete. MLGF-MF thereby overlaps CPU and I/O processing, which offsets the I/O cost and maximizes CPU utilization. Recent flash SSDs support high-performance parallel I/O and are therefore well suited to asynchronous I/O. In our extensive evaluations, MLGF-MF significantly outperforms (that is, converges faster than) the state-of-the-art algorithms in both shared-memory and block-storage environments. In addition, the outputs of MLGF-MF are significantly more robust to skewed matrices. Our implementation of MLGF-MF is available at http://dm.postech.ac.kr/MLGF-MF as executable files.
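As a rough illustration of the idea behind block-partitioned SGD matrix factorization, the sketch below splits the ratings over a plain uniform grid and visits blocks in groups that share no row range or column range, so each group could in principle be updated in parallel. It is not MLGF-MF: the uniform grid stands in for the MLGF partitioning, the blocks are processed serially, there is no asynchronous I/O, and every name and parameter (`factorize`, `sgd_block`, `grid`, `lr`, `reg`) is a hypothetical choice made for this example.

```python
import random


def rmse(ratings, P, Q):
    """Root mean squared error of the factorization on (row, col, value) triples."""
    se = 0.0
    for r, c, v in ratings:
        pred = sum(p * q for p, q in zip(P[r], Q[c]))
        se += (v - pred) ** 2
    return (se / len(ratings)) ** 0.5


def sgd_block(block, P, Q, lr, reg):
    """Run one SGD pass over the ratings of a single grid block."""
    for r, c, v in block:
        p, q = P[r], Q[c]
        err = v - sum(a * b for a, b in zip(p, q))
        for f in range(len(p)):
            pf, qf = p[f], q[f]
            p[f] += lr * (err * qf - reg * pf)
            q[f] += lr * (err * pf - reg * qf)


def factorize(ratings, n_rows, n_cols, k=2, grid=2,
              lr=0.05, reg=0.01, epochs=200, seed=0):
    """Factorize a sparse matrix with SGD over a uniform grid (assumed layout)."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_rows)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_cols)]

    # Assign each rating to a grid cell by its row range and column range.
    blocks = {}
    for r, c, v in ratings:
        key = (r * grid // n_rows, c * grid // n_cols)
        blocks.setdefault(key, []).append((r, c, v))

    for _ in range(epochs):
        for shift in range(grid):
            # Blocks (i, (i + shift) % grid) share no rows or columns, so a
            # parallel version could hand each one to a different worker.
            for i in range(grid):
                block = blocks.get((i, (i + shift) % grid), [])
                rng.shuffle(block)
                sgd_block(block, P, Q, lr, reg)
    return P, Q


if __name__ == "__main__":
    # Toy rank-1 matrix, fully observed (illustrative data only).
    u = [1.0, 2.0, 1.5, 0.5]
    w = [1.0, 0.5, 2.0, 1.5]
    data = [(r, c, u[r] * w[c]) for r in range(4) for c in range(4)]
    P, Q = factorize(data, n_rows=4, n_cols=4)
    print("training RMSE:", rmse(data, P, Q))
```

A uniform grid like this one puts very different numbers of ratings in each block when the matrix is skewed, which is the imbalance the abstract says MLGF-MF is designed to handle.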
Pages: 865-874
Number of pages: 10
Related papers
50 in total
  • [21] Optimal Topology Search for Fast Model Averaging in Decentralized Parallel SGD
    Jameel, Mohsan
    Jawed, Shayan
    Schmidt-Thieme, Lars
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2020, PT II, 2020, 12085 : 894 - 905
  • [22] FAST PARALLEL ALGORITHMS FOR QR AND TRIANGULAR FACTORIZATION
    CHUN, J
    KAILATH, T
    LEVARI, H
    SIAM JOURNAL ON SCIENTIFIC AND STATISTICAL COMPUTING, 1987, 8 (06): : 899 - 913
  • [23] Parallel approximate matrix factorization for kernel methods
    Zhu, Kaihua
    Cui, Hang
    Bai, Hongjie
    Li, Jian
    Qiu, Zhihuan
    Wang, Hao
    Xu, Hui
    Chang, Edward Y.
    2007 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-5, 2007, : 1275 - 1278
  • [24] Sparse Matrix Factorization on Massively Parallel Computers
    Gupta, Anshul
    Koric, Seid
    George, Thomas
    PROCEEDINGS OF THE CONFERENCE ON HIGH PERFORMANCE COMPUTING NETWORKING, STORAGE AND ANALYSIS, 2009,
  • [25] Parallel Cholesky factorization of a block tridiagonal matrix
    Cao, TD
    Hall, JF
    van de Geijn, RA
    2002 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, PROCEEDINGS OF THE WORKSHOPS, 2002, : 327 - 335
  • [26] Parallel Nonnegative Matrix Factorization with Manifold Regularization
    Liu, Fudong
    Shan, Zheng
    Chen, Yihang
    JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING, 2018, 2018
  • [27] Scalable Robust Matrix Factorization with Nonconvex Loss
    Yao, Quanming
    Kwok, James T.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [28] Robust Asymmetric Bayesian Adaptive Matrix Factorization
    Guo, Xin
    Pan, Boyuan
    Cai, Deng
    He, Xiaofei
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 1760 - 1766
  • [29] Robust Graph Regularized Nonnegative Matrix Factorization
    Huang, Qi
    Zhang, Guodao
    Yin, Xuesong
    Wang, Yigang
    IEEE ACCESS, 2022, 10 : 86962 - 86978
  • [30] ROBUST PRINCIPAL COMPONENT ANALYSIS WITH MATRIX FACTORIZATION
    Chen, Yongyong
    Zhou, Yicong
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 2411 - 2415