Hierarchical Clustering via Spreading Metrics

Authors
Roy, Aurko [1]
Pokutta, Sebastian [2]
Affiliations
[1] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
[2] Georgia Inst Technol, ISyE, Atlanta, GA 30332 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We study the cost function for hierarchical clusterings introduced by [16], in which hierarchies are treated as first-class objects rather than deriving their cost from projections into flat clusters. It was also shown in [16] that a top-down algorithm returns a hierarchical clustering of cost at most O(α(n) log n) times the cost of the optimal hierarchical clustering, where α(n) is the approximation ratio of the Sparsest Cut subroutine used. Thus, using the best known approximation algorithm for Sparsest Cut, due to Arora-Rao-Vazirani, the top-down algorithm returns a hierarchical clustering of cost at most O(log^{3/2} n) times the cost of the optimal solution. We improve this by giving an O(log n)-approximation algorithm for this problem. Our main technical ingredients are a combinatorial characterization of the ultrametrics induced by this cost function, an Integer Linear Programming (ILP) formulation for this family of ultrametrics, and an iterative rounding of an LP relaxation of this formulation based on sphere growing, an idea used extensively in graph partitioning. We also prove that our algorithm returns an O(log n)-approximate hierarchical clustering for a generalization of this cost function also studied in [16]. Finally, we give constant factor inapproximability results for this problem.
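The abstract does not restate the cost function from [16]. As commonly stated, it charges each pair of points its similarity weight times the number of leaves under the pair's lowest common ancestor in the hierarchy; that definition is recalled here as an assumption, not taken from this record. The following is a minimal sketch of evaluating such a cost on a toy instance. The function names, the nested-tuple tree encoding, and the example weights are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaves under a nested-tuple tree; any non-tuple is a leaf."""
    if not isinstance(tree, tuple):
        return [tree]
    out = []
    for child in tree:
        out.extend(leaves(child))
    return out


def hierarchy_cost(tree, weight):
    """Assumed cost: sum over pairs {i, j} of
    weight[{i, j}] * (number of leaves under the lowest common ancestor of i and j)."""
    if not isinstance(tree, tuple):
        return 0
    child_leaves = [leaves(child) for child in tree]
    size = sum(len(ls) for ls in child_leaves)
    cost = 0
    # Pairs separated at this node have this node as their lowest common ancestor.
    for a, b in combinations(child_leaves, 2):
        for i in a:
            for j in b:
                cost += weight.get(frozenset((i, j)), 0) * size
    # Pairs that stay together are charged deeper in the tree.
    return cost + sum(hierarchy_cost(child, weight) for child in tree)


# Toy similarity graph (hypothetical): two tight pairs joined by one weak edge.
w = {
    frozenset((0, 1)): 3,
    frozenset((2, 3)): 3,
    frozenset((1, 2)): 1,
}

good = ((0, 1), (2, 3))  # cuts only the weak edge at the root
bad = ((0, 2), (1, 3))   # cuts both heavy edges at the root

print(hierarchy_cost(good, w), hierarchy_cost(bad, w))
```

On this instance the hierarchy that separates the weakly connected pairs first has the lower cost, which matches the intuition that heavy edges should be cut as low in the tree as possible.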
Pages: 9