Efficient, certifiably optimal clustering with applications to latent variable graphical models

Authors
Carson Eisenach
Han Liu
Affiliations
[1] Princeton University, Department of Operations Research and Financial Engineering
[2] Northwestern University, Department of Electrical Engineering and Computer Science
Source
Mathematical Programming | 2019 / Vol. 176
Keywords
90C22; 90C35; 90C90; 62H30
DOI
Not available
Abstract
Motivated by the task of clustering either d variables or d points into K groups, we investigate efficient algorithms to solve the Peng–Wei (P–W) K-means semi-definite programming (SDP) relaxation. The P–W SDP has been shown in the literature to have good statistical properties in a variety of settings, but remains intractable to solve in practice. To this end we propose FORCE, a new algorithm to solve this SDP relaxation. Compared to off-the-shelf interior point solvers, our method reduces the computational complexity of solving the SDP from $\widetilde{\mathcal{O}}(d^{7}\log \epsilon^{-1})$ to $\widetilde{\mathcal{O}}(d^{6}K^{-2}\epsilon^{-1})$ arithmetic operations for an $\epsilon$-optimal solution. Our method combines a primal first-order method with a dual optimality certificate search, which, when successful, allows for early termination of the primal method. We show for certain variable clustering problems that, with high probability, FORCE is guaranteed to find the optimal solution to the SDP relaxation and provide a certificate of exact optimality. As verified by our numerical experiments, this allows FORCE to solve the P–W SDP with dimensions in the hundreds in only tens of seconds.
For a variation of the P–W SDP where K is not known a priori, a slight modification of FORCE reduces the computational complexity of solving this problem as well: from $\widetilde{\mathcal{O}}(d^{7}\log \epsilon^{-1})$ using a standard SDP solver to $\widetilde{\mathcal{O}}(d^{4}\epsilon^{-1})$.
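
As an illustration of the relaxation the abstract refers to, the sketch below checks the standard Peng–Wei feasible set (X symmetric positive semi-definite, entrywise non-negative, unit row sums, trace equal to K) on the normalized partition matrix of a hard clustering, and confirms that half the inner product of the squared-distance matrix with X equals the K-means cost. It is not the FORCE algorithm and is not taken from the paper; the function names, the tolerance, and the toy data are assumptions chosen for this example.

```python
import numpy as np


def partition_matrix(labels, K):
    """Normalized partition matrix: X[i, j] = 1/|C_k| if i and j share cluster k, else 0."""
    labels = np.asarray(labels)
    d = labels.shape[0]
    X = np.zeros((d, d))
    for k in range(K):
        idx = np.flatnonzero(labels == k)
        if idx.size == 0:
            raise ValueError(f"cluster {k} is empty")
        X[np.ix_(idx, idx)] = 1.0 / idx.size
    return X


def is_pw_feasible(X, K, tol=1e-8):
    """Check the Peng-Wei constraints: X symmetric PSD, X >= 0, X @ 1 = 1, trace(X) = K."""
    symmetric = np.allclose(X, X.T, atol=tol)
    psd = np.linalg.eigvalsh((X + X.T) / 2).min() >= -tol
    nonneg = X.min() >= -tol
    unit_rows = np.allclose(X.sum(axis=1), 1.0, atol=tol)
    trace_ok = abs(np.trace(X) - K) <= tol
    return bool(symmetric and psd and nonneg and unit_rows and trace_ok)


def kmeans_cost(points, labels, K):
    """Sum of squared distances from each point to its cluster centroid."""
    labels = np.asarray(labels)
    cost = 0.0
    for k in range(K):
        pts = points[labels == k]
        cost += ((pts - pts.mean(axis=0)) ** 2).sum()
    return cost


def sdp_objective(points, X):
    """Relaxation objective 0.5 * <D, X>, with D the matrix of squared pairwise distances."""
    sq = (points ** 2).sum(axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    return 0.5 * np.sum(D * X)


# Toy data (hypothetical): four points in the plane, two clusters.
points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 4.0]])
labels = [0, 0, 1, 1]
X = partition_matrix(labels, K=2)

print(is_pw_feasible(X, K=2))
print(kmeans_cost(points, labels, K=2), sdp_objective(points, X))
```

Every hard clustering yields a feasible point of this kind, which is why the SDP is a relaxation of K-means: the solver optimizes over the larger convex set, and a certificate of optimality shows that no feasible X does better.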
Pages: 137-173
Number of pages: 36