Flexible Low-Rank Statistical Modeling with Missing Data and Side Information

Cited by: 16
Authors
Fithian, William [1]
Mazumder, Rahul [2, 3]
Affiliations
[1] Univ Calif Berkeley, Dept Stat, 301 Evans Hall, Berkeley, CA 94720 USA
[2] MIT, Sloan Sch Management, Operat Res Ctr, Bldg E62-583,100 Main St, Cambridge, MA 02142 USA
[3] MIT, Ctr Stat, Bldg E62-583,100 Main St, Cambridge, MA 02142 USA
Keywords
Matrix completion; nuclear norm regularization; matrix factorization; convex optimization; missing data; minimization; algorithms; shrinkage; values; norm
DOI
10.1214/18-STS642
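Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract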
We explore a general statistical framework for low-rank modeling of matrix-valued data, based on convex optimization with a generalized nuclear norm penalty. We study several related problems: the usual low-rank matrix completion problem with flexible loss functions arising from generalized linear models; reduced-rank regression and multi-task learning; and generalizations of both problems where side information about rows and columns is available, in the form of features or smoothing kernels. We show that our approach encompasses maximum a posteriori estimation arising from Bayesian hierarchical modeling with latent factors, and discuss ramifications of the missing-data mechanism in the context of matrix completion. While the above problems can be naturally posed as rank-constrained optimization problems, which are nonconvex and computationally difficult, we show how to relax them via generalized nuclear norm regularization to obtain convex optimization problems. We discuss algorithms drawing inspiration from modern convex optimization methods to address these large scale convex optimization computational tasks. Finally, we illustrate our flexible approach in problems arising in functional data reconstruction and ecological species distribution modeling.
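The relaxation described in the abstract can be illustrated in its simplest special case: squared-error matrix completion with the ordinary (not generalized) nuclear norm penalty and no side information. The sketch below uses a Soft-Impute style iteration (repeated soft-thresholding of singular values), which is a standard technique substituted here for illustration and is not claimed to be the algorithm of the paper. The function names `soft_threshold_svd` and `soft_impute`, the penalty value `lam`, and the synthetic data are all assumptions.

```python
import numpy as np

def soft_threshold_svd(Z, lam):
    """Shrink the singular values of Z by lam (the nuclear-norm proximal step)."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)
    return (U * s_shrunk) @ Vt

def soft_impute(X, mask, lam, n_iters=200, tol=1e-6):
    """Illustrative solver for
        min_M 0.5 * ||mask * (X - M)||_F^2 + lam * ||M||_*
    X: data matrix (values at unobserved entries are ignored),
    mask: boolean array, True where X is observed.
    """
    M = np.zeros_like(X, dtype=float)
    for _ in range(n_iters):
        # Fill unobserved entries with the current estimate, then shrink.
        Z = np.where(mask, X, M)
        M_new = soft_threshold_svd(Z, lam)
        change = np.linalg.norm(M_new - M) / max(np.linalg.norm(M), 1e-12)
        M = M_new
        if change < tol:
            break
    return M

# Hypothetical synthetic example: a rank-2 matrix with about 60% of entries observed.
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))
mask = rng.random(A.shape) < 0.6
X = np.where(mask, A, 0.0)
M_hat = soft_impute(X, mask, lam=0.5)
```

Larger values of `lam` shrink more singular values to zero and so yield lower-rank estimates, which is how the convex penalty stands in for an explicit rank constraint. Handling the general loss functions and side information discussed in the abstract would require changes beyond this sketch.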
Pages: 238-260
Number of pages: 23
Related papers
50 in total
  • [41] Classification with Low Rank and Missing Data
    Hazan, Elad
    Livni, Roi
    Mansour, Yishay
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 257 - 266
  • [42] A Novel Interpolation-SVT Approach for Recovering Missing Low-Rank Air Quality Data
    Yu, Yangwen
    Yu, James J. Q.
    Li, Victor O. K.
    Lam, Jacqueline C. K.
    IEEE ACCESS, 2020, 8 (08) : 74291 - 74305
  • [43] AUTOREGRESSION AND STRUCTURED LOW-RANK MODELING OF SINOGRAMS
    Lobos, Rodrigo A.
    Leahy, Richard M.
    Haldar, Justin P.
    2020 IEEE 17TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI 2020), 2020 : 178 - 181
  • [44] Low-Rank Characteristic and Temporal Correlation Analytics for Incipient Industrial Fault Detection With Missing Data
    Yu, Wanke
    Zhao, Chunhui
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2021, 17 (09) : 6337 - 6346
  • [45] Interpolation method of traffic volume missing data based on improved low-rank matrix completion
    Chen, Xiao-Bo
    Chen, Cheng
    Chen, Lei
    Wei, Zhong-Jie
    Cai, Ying-Feng
    Zhou, Jun-Jie
    Jiaotong Yunshu Gongcheng Xuebao/Journal of Traffic and Transportation Engineering, 2019, 19 (05) : 180 - 190
  • [46] Robust low-rank data matrix approximations
    XingDong Feng
    XuMing He
    Science China Mathematics, 2017, 60 : 189 - 200
  • [47] On the low-rank approximation of data on the unit sphere
    Chu, M
    Del Buono, N
    Lopez, L
    Politi, T
    SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 2005, 27 (01) : 46 - 60
  • [48] Robust low-rank data matrix approximations
    Feng XingDong
    He XuMing
    SCIENCE CHINA-MATHEMATICS, 2017, 60 (02) : 189 - 200
  • [49] Matrix recovery with implicitly low-rank data
    Xie, Xingyu
    Wu, Jianlong
    Liu, Guangcan
    Wang, Jun
    NEUROCOMPUTING, 2019, 334 : 219 - 226
  • [50] Robust low-rank data matrix approximations
    FENG XingDong
    HE XuMing
    Science China (Mathematics), 2017, 60 (02) : 189 - 200