Flexible Low-Rank Statistical Modeling with Missing Data and Side Information

Cited by: 16
Authors
Fithian, William [1]
Mazumder, Rahul [2, 3]
Affiliations
[1] Univ Calif Berkeley, Dept Stat, 301 Evans Hall, Berkeley, CA 94720 USA
[2] MIT, Sloan Sch Management, Operat Res Ctr, Bldg E62-583,100 Main St, Cambridge, MA 02142 USA
[3] MIT, Ctr Stat, Bldg E62-583,100 Main St, Cambridge, MA 02142 USA
Keywords
Matrix completion; nuclear norm regularization; matrix factorization; convex optimization; missing data; minimization; algorithms; shrinkage; values; norm
DOI
10.1214/18-STS642
Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
We explore a general statistical framework for low-rank modeling of matrix-valued data, based on convex optimization with a generalized nuclear norm penalty. We study several related problems: the usual low-rank matrix completion problem with flexible loss functions arising from generalized linear models; reduced-rank regression and multi-task learning; and generalizations of both problems where side information about rows and columns is available, in the form of features or smoothing kernels. We show that our approach encompasses maximum a posteriori estimation arising from Bayesian hierarchical modeling with latent factors, and discuss ramifications of the missing-data mechanism in the context of matrix completion. While the above problems can be naturally posed as rank-constrained optimization problems, which are nonconvex and computationally difficult, we show how to relax them via generalized nuclear norm regularization to obtain convex optimization problems. We discuss algorithms drawing inspiration from modern convex optimization methods to address these large-scale convex optimization computational tasks. Finally, we illustrate our flexible approach in problems arising in functional data reconstruction and ecological species distribution modeling.
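The convex relaxation described in the abstract can be illustrated with a minimal sketch. It assumes squared-error loss and the ordinary nuclear norm rather than the generalized penalty, and it uses the standard soft-thresholded SVD iteration (often called Soft-Impute), which is not necessarily the algorithm used in the paper. The function names, the penalty value `lam`, and the synthetic data are assumptions made only for illustration.

```python
import numpy as np


def soft_threshold_svd(Z, lam):
    """Proximal operator of lam * (nuclear norm): shrink each singular value by lam."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)
    return (U * s_shrunk) @ Vt


def soft_impute(X, observed, lam, n_iters=200, tol=1e-6):
    """Approximately minimize
        0.5 * ||P_obs(X - M)||_F^2 + lam * ||M||_*
    by repeatedly filling unobserved entries with the current estimate
    and applying singular value soft-thresholding.
    """
    M = np.zeros_like(X, dtype=float)
    for _ in range(n_iters):
        # Observed entries come from X, missing ones from the current M.
        filled = np.where(observed, X, M)
        M_new = soft_threshold_svd(filled, lam)
        # Relative change in Frobenius norm as a simple stopping rule.
        change = np.linalg.norm(M_new - M) / max(np.linalg.norm(M), 1e-12)
        M = M_new
        if change < tol:
            break
    return M


# Synthetic rank-2 matrix with roughly 60% of entries observed (illustrative only).
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))
observed = rng.random(A.shape) < 0.6
X = np.where(observed, A, 0.0)
M_hat = soft_impute(X, observed, lam=0.5)
```

Larger values of `lam` shrink more singular values to zero and so yield lower-rank estimates; a penalty above the largest singular value of the zero-filled data returns the zero matrix.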
Pages: 238-260
Page count: 23