Efficient, certifiably optimal clustering with applications to latent variable graphical models

Cited by: 0
Authors
Carson Eisenach
Han Liu
Affiliations
[1] Princeton University, Department of Operations Research and Financial Engineering
[2] Northwestern University, Department of Electrical Engineering and Computer Science
Source
Mathematical Programming | 2019 / Vol. 176
Keywords
90C22; 90C35; 90C90; 62H30;
DOI
Not available
Abstract
Motivated by the task of clustering either d variables or d points into K groups, we investigate efficient algorithms to solve the Peng–Wei (P–W) K-means semi-definite programming (SDP) relaxation. The P–W SDP has been shown in the literature to have good statistical properties in a variety of settings, but remains intractable to solve in practice. To this end we propose FORCE, a new algorithm to solve this SDP relaxation. Compared to off-the-shelf interior point solvers, our method reduces the computational complexity of solving the SDP from $\widetilde{\mathcal{O}}(d^7\log \epsilon^{-1})$ to $\widetilde{\mathcal{O}}(d^{6}K^{-2}\epsilon^{-1})$ arithmetic operations for an $\epsilon$-optimal solution. Our method combines a primal first-order method with a dual optimality certificate search, which when successful, allows for early termination of the primal method. We show for certain variable clustering problems that, with high probability, FORCE is guaranteed to find the optimal solution to the SDP relaxation and provide a certificate of exact optimality. As verified by our numerical experiments, this allows FORCE to solve the P–W SDP with dimensions in the hundreds in only tens of seconds.
For a variation of the P–W SDP where K is not known a priori a slight modification of FORCE reduces the computational complexity of solving this problem as well: from $\widetilde{\mathcal{O}}(d^7\log \epsilon^{-1})$ using a standard SDP solver to $\widetilde{\mathcal{O}}(d^{4}\epsilon^{-1})$.
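For orientation, the feasible set of the P–W relaxation can be illustrated with a short NumPy sketch. It is not FORCE and does not solve the SDP. It assumes the commonly stated point-clustering form of the relaxation (minimize <D, Z> subject to Z positive semidefinite, Z entrywise nonnegative, unit row sums, and trace K, with D the matrix of squared Euclidean distances) and checks that the normalized partition matrix of a fixed clustering is feasible and that half its objective value equals the K-means cost. The helper names `partition_matrix`, `is_pw_feasible`, and `kmeans_cost` are hypothetical and do not come from the paper.

```python
import numpy as np


def partition_matrix(labels, K):
    """Normalized partition matrix: Z[i, j] = 1/|G_k| if i and j are both in group k.

    Assumes every one of the K groups is non-empty.
    """
    labels = np.asarray(labels)
    d = labels.shape[0]
    Z = np.zeros((d, d))
    for k in range(K):
        idx = np.flatnonzero(labels == k)
        Z[np.ix_(idx, idx)] = 1.0 / idx.size
    return Z


def is_pw_feasible(Z, K, tol=1e-8):
    """Check the assumed P-W constraints: PSD, nonnegative, rows sum to 1, trace K."""
    symmetric = np.allclose(Z, Z.T, atol=tol)
    psd = np.linalg.eigvalsh((Z + Z.T) / 2).min() >= -tol
    nonneg = Z.min() >= -tol
    row_sums = np.allclose(Z.sum(axis=1), 1.0, atol=tol)
    trace_ok = abs(np.trace(Z) - K) <= tol
    return bool(symmetric and psd and nonneg and row_sums and trace_ok)


def kmeans_cost(X, labels, K):
    """Sum of squared distances from each point to its group centroid."""
    labels = np.asarray(labels)
    total = 0.0
    for k in range(K):
        group = X[labels == k]
        total += ((group - group.mean(axis=0)) ** 2).sum()
    return total


# Toy data (illustrative only): 6 points in the plane, 3 groups of 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
labels = np.array([0, 0, 1, 1, 2, 2])
K = 3

# Pairwise squared Euclidean distances.
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
Z = partition_matrix(labels, K)

# For a partition matrix, <D, Z> / 2 coincides with the K-means cost.
sdp_objective = 0.5 * np.sum(D * Z)
```

Because every valid clustering maps to a feasible Z with matching objective, the SDP optimum lower-bounds the best K-means cost; the relaxation is exact when the SDP optimum is itself a partition matrix.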
Pages: 137-173
Page count: 36
Related papers
50 items in total
  • [21] APPLICATIONS OF LATENT VARIABLE MODELS IN BEHAVIORAL MEDICINE RESEARCH
    Llabre, Maria M.
    Arguelles, William
    INTERNATIONAL JOURNAL OF BEHAVIORAL MEDICINE, 2014, 21 : S2 - S2
  • [22] Hierarchical clustering with discrete latent variable models and the integrated classification likelihood
    Etienne Côme
    Nicolas Jouvin
    Pierre Latouche
    Charles Bouveyron
    Advances in Data Analysis and Classification, 2021, 15 : 957 - 986
  • [23] Hierarchical clustering with discrete latent variable models and the integrated classification likelihood
    Come, Etienne
    Jouvin, Nicolas
    Latouche, Pierre
    Bouveyron, Charles
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2021, 15 (04) : 957 - 986
  • [24] Etiologic inertia: Using latent variable models to address risk clustering
    Glass, T. A.
    AMERICAN JOURNAL OF EPIDEMIOLOGY, 2008, 167 (11) : S43 - S43
  • [25] Pairwise clustering and graphical models
    Shental, N
    Zomet, A
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 185 - 192
  • [26] Combining Graphical and Algebraic Approaches for Parameter Identification in Latent Variable Structural Equation Models
    Ankan, Ankur
    Wortel, Inge
    Bollen, Kenneth A.
    Textor, Johannes
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 206, 2023, 206
  • [27] Efficient inference for sparse latent variable models of transcriptional regulation
    Dai, Zhenwen
    Iqbal, Mudassar
    Lawrence, Neil D.
    Rattray, Magnus
    BIOINFORMATICS, 2017, 33 (23) : 3776 - 3783
  • [28] Graph learning for latent-variable Gaussian graphical models under laplacian constraints
    Li, Ran
    Lin, Jiming
    Qiu, Hongbing
    Zhang, Wenhui
    Wang, Junyi
    NEUROCOMPUTING, 2023, 532 : 67 - 76
  • [29] Learning Latent Tree Graphical Models
    Choi, Myung Jin
    Tan, Vincent Y. F.
    Anandkumar, Animashree
    Willsky, Alan S.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2011, 12 : 1771 - 1812
  • [30] Latent variable models
    Bishop, CM
    LEARNING IN GRAPHICAL MODELS, 1998, 89 : 371 - 403