Consensus Clustering Based on a New Probabilistic Rand Index with Application to Subtopic Retrieval

Cited by: 44
Authors
Carpineto, Claudio [1]
Romano, Giovanni [1]
Affiliations
[1] Fdn Ugo Bordoni, I-00161 Rome, Italy
Keywords
Consensus clustering; Rand index; probabilistic Rand index; search results clustering; subtopic retrieval;
DOI
10.1109/TPAMI.2012.80
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce a probabilistic version of the well-known Rand Index (RI) for measuring the similarity between two partitions, called Probabilistic Rand Index (PRI), in which agreements and disagreements at the object-pair level are weighted according to the probability of their occurring by chance. We then cast consensus clustering as an optimization problem of the PRI value between a target partition and a set of given partitions, experimenting with a simple and very efficient stochastic optimization algorithm. Remarkable performance gains over input partitions as well as over existing related methods are demonstrated through a range of applications, including a new use of consensus clustering to improve subtopic retrieval.
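For orientation, the plain Rand Index that the abstract builds on is the fraction of object pairs on which two partitions agree (both place the pair together, or both place it apart). The sketch below covers only that well-known unweighted index; the chance-based pair weighting that defines the PRI is not spelled out in this record, so it is deliberately left out. The names `rand_index` and `mean_rand_index` are illustrative, and averaging the plain index over the input partitions is an assumed stand-in for the paper's PRI objective, not the authors' method.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Plain (unweighted) Rand Index between two partitions of the same objects.

    Each partition is a sequence of cluster labels, one per object.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two objects to form a pair")

    agreements = 0
    total_pairs = 0
    for i, j in combinations(range(n), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        # A pair counts as an agreement when both partitions treat it alike.
        if same_a == same_b:
            agreements += 1
        total_pairs += 1
    return agreements / total_pairs


def mean_rand_index(candidate, partitions):
    """Average plain Rand Index of a candidate partition against given ones.

    Assumption: used here only as a simple substitute for the PRI-based
    consensus objective described in the abstract.
    """
    return sum(rand_index(candidate, p) for p in partitions) / len(partitions)
```

A consensus search would then look for the candidate labelling that maximizes such a score over the set of input partitions; the index ignores label names, so relabelled but otherwise identical partitions score 1.0.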
Pages: 2315 - 2326
Page count: 12
Related papers
50 in total
  • [31] Effective hierarchical cluster analysis based on new clustering validity index
    Zhu, Er-Zhou
    Ju, Yin-Yin
    Liu, Da-Wei
    Li, Yang
    Liu, Dong
    Zhu-Juan, Z.-J.
    Zhu-Juan, Z.-J., 1600, Codon Publications (31): : 119 - 133
  • [32] A New Cluster Validity Index for Stock Clustering Based on Efficient Frontier
    Lu, Yahui
    Li, Minghao
    Tang, Xiaochu
    Wang, Hui
    2020 5TH IEEE INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS (IEEE ICBDA 2020), 2020, : 193 - 197
  • [33] A New Clustering Validity Index based on K-means Algorithm
    Hou, Xiangru
    2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018), 2019, 1187
  • [34] New cluster validity index for fuzzy clustering based on similarity measure
    Hossein, Mohammad
    Zarandi, Fazel
    Neshat, Elahe
    Tuerksen, I. Burhan
    ROUGH SETS, FUZZY SETS, DATA MINING AND GRANULAR COMPUTING, PROCEEDINGS, 2007, 4482 : 127 - +
  • [35] A New Fuzzy Clustering Validity Index Based on Fuzzy Proximity Matrices
    Valente, Rafael Xavier
    Braga, Antonio Padua
    Pedrycz, Witold
    2013 1ST BRICS COUNTRIES CONGRESS ON COMPUTATIONAL INTELLIGENCE AND 11TH BRAZILIAN CONGRESS ON COMPUTATIONAL INTELLIGENCE (BRICS-CCI & CBIC), 2013, : 489 - 494
  • [36] A novel index retrieval and query optimisation method for private information retrieval in location-based service application
    Kumar K.M.M.
    Bhat R.
    Sunitha N.R.
    International Journal of Intelligent Information and Database Systems, 2021, 14 (04) : 379 - 402
  • [37] A new approach for probabilistic harmonic load flow in distribution systems based on data clustering
    Galvani, Sadjad
    Marjani, Saeed Rezaeian
    Morsali, Javad
    Jirdehi, Mehdi Ahmadi
    ELECTRIC POWER SYSTEMS RESEARCH, 2019, 176
  • [38] A New Label Maximization Based Incremental Neural Clustering Approach: Application to Text Clustering
    Lamirel, Jean-Charles
    Mall, Raghvendra
    Al Shehabi, Shadi
    Safi, Ghada
    ADVANCES IN SELF-ORGANIZING MAPS, WSOM 2011, 2011, 6731 : 257 - 266
  • [39] A new secure data retrieval system based on ECDH and hierarchical clustering with Pearson correlation
    Swami, Rosy
    Das, Prodipto
    INNOVATIONS IN SYSTEMS AND SOFTWARE ENGINEERING, 2025, 21 (01) : 195 - 205
  • [40] Topology: A Theory of a Pseudometric-Based Clustering Model and Its Application in Content-Based Image Retrieval
    Osuna-Galan, I.
    Perez-Pimentel, Y.
    Aviles-Cruz, Carlos
    Villegas-Cortez, Juan
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2019, 2019