Hierarchical Overlapping Clustering of Network Data Using Cut Metrics

Cited by: 1
Authors
Gama, Fernando [1]
Segarra, Santiago [2]
Ribeiro, Alejandro [1]
Affiliations
[1] Univ Penn, Dept Elect & Syst Engn, Philadelphia, PA 19104 USA
[2] MIT, Inst Data Syst & Soc, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
Clustering; cut metrics; covering; dithering; hierarchical clustering; network theory
DOI
10.1109/TSIPN.2017.2707662
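Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract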
A novel method is presented for obtaining hierarchical and overlapping clusters from network data, i.e., a set of nodes endowed with pairwise dissimilarities. The method is hierarchical in the sense that it outputs a nested collection of groupings of the node set depending on the resolution or degree of similarity desired, and it is overlapping since it allows nodes to belong to more than one group. Our construction is rooted in two facts: a hierarchical (non-overlapping) clustering of a network can be equivalently represented by a finite ultrametric space, and a convex combination of ultrametrics results in a cut metric. By applying a hierarchical (non-overlapping) clustering method to multiple dithered versions of a given network and then convexly combining the resulting ultrametrics, we obtain a cut metric associated with the network of interest. We then show how to extract a hierarchical overlapping clustering structure from this cut metric. Furthermore, the so-called overlapping function is presented as a tool for gaining insights about the data by identifying meaningful resolutions of the obtained hierarchical structure. Additionally, we explore hierarchical overlapping quasi-clustering methods that preserve the asymmetry of the data contained in directed networks. Finally, the presented method is illustrated via synthetic and real-world classification problems, including handwritten digit classification and authorship attribution of famous plays.
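The dither-and-combine step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: single linkage is assumed as the base hierarchical method, the dithering is assumed to be uniform additive noise, and the convex combination is assumed to use equal weights. The function names and the parameters `num_dithers`, `noise`, and `seed` are hypothetical. Extracting overlapping clusters from the resulting metric is not shown.

```python
import random


def single_linkage_ultrametric(d):
    """Return the minimax-path distance matrix of d.

    For a symmetric dissimilarity matrix, this is the ultrametric that
    single linkage hierarchical clustering induces on the nodes.
    """
    n = len(d)
    u = [row[:] for row in d]
    # Floyd-Warshall style update: the cost of a path is its largest hop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                u[i][j] = min(u[i][j], max(u[i][k], u[k][j]))
    return u


def dither(d, noise, rng):
    """Perturb each off-diagonal dissimilarity symmetrically.

    Assumption: uniform additive noise in [-noise, noise], floored at a
    small positive value so dissimilarities stay positive.
    """
    n = len(d)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            value = max(d[i][j] + rng.uniform(-noise, noise), 1e-12)
            out[i][j] = out[j][i] = value
    return out


def dithered_cut_metric(d, num_dithers=20, noise=0.1, seed=0):
    """Average the ultrametrics obtained from dithered copies of d.

    An equal-weight average is one particular convex combination.
    """
    rng = random.Random(seed)
    n = len(d)
    avg = [[0.0] * n for _ in range(n)]
    for _ in range(num_dithers):
        u = single_linkage_ultrametric(dither(d, noise, rng))
        for i in range(n):
            for j in range(n):
                avg[i][j] += u[i][j] / num_dithers
    return avg


if __name__ == "__main__":
    # Two tight pairs (0, 1) and (2, 3) joined by a weaker link between 1 and 2.
    d = [
        [0.0, 1.0, 4.0, 5.0],
        [1.0, 0.0, 2.0, 5.0],
        [4.0, 2.0, 0.0, 1.0],
        [5.0, 5.0, 1.0, 0.0],
    ]
    for row in dithered_cut_metric(d):
        print([round(x, 3) for x in row])
```

Because each dithered ultrametric satisfies the triangle inequality, so does their average, but the average generally no longer satisfies the stronger ultrametric inequality. That relaxation is what leaves room for a node to sit close to more than one group.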
Pages: 392-406
Page count: 15