Consensus Clustering Based on a New Probabilistic Rand Index with Application to Subtopic Retrieval

Cited by: 44
Authors
Carpineto, Claudio [1]
Romano, Giovanni [1]
Affiliations
[1] Fondazione Ugo Bordoni, I-00161 Rome, Italy
Keywords
Consensus clustering; Rand index; probabilistic Rand index; search results clustering; subtopic retrieval
DOI
10.1109/TPAMI.2012.80
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a probabilistic version of the well-known Rand Index (RI) for measuring the similarity between two partitions, called Probabilistic Rand Index (PRI), in which agreements and disagreements at the object-pair level are weighted according to the probability of their occurring by chance. We then cast consensus clustering as an optimization problem of the PRI value between a target partition and a set of given partitions, experimenting with a simple and very efficient stochastic optimization algorithm. Remarkable performance gains over input partitions as well as over existing related methods are demonstrated through a range of applications, including a new use of consensus clustering to improve subtopic retrieval.
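As a rough illustration of the ideas in the abstract, the sketch below computes the plain (unweighted) Rand Index between two partitions and then runs a generic stochastic hill climb that searches for a target partition with high mean Rand Index against a set of input partitions. It rests on several assumptions: the chance-based pair weighting that defines the PRI is not given in the abstract, so the ordinary RI stands in for it; the hill climb is a swapped-in generic technique, not the authors' optimization algorithm; and the names `rand_index`, `mean_rand_index`, and `consensus_hill_climb` are hypothetical.

```python
import random
from itertools import combinations


def rand_index(p, q):
    """Plain Rand Index between two partitions given as equal-length label lists.

    A pair of objects counts as an agreement when both partitions put the
    pair in the same cluster, or both put it in different clusters.
    """
    if len(p) != len(q):
        raise ValueError("partitions must label the same objects")
    pairs = list(combinations(range(len(p)), 2))
    if not pairs:
        return 1.0
    agreements = sum((p[i] == p[j]) == (q[i] == q[j]) for i, j in pairs)
    return agreements / len(pairs)


def mean_rand_index(target, partitions):
    """Average Rand Index of a target partition against all input partitions."""
    return sum(rand_index(target, p) for p in partitions) / len(partitions)


def consensus_hill_climb(partitions, n_clusters, n_iter=2000, seed=0):
    """Generic stochastic hill climb (an assumption, not the paper's method).

    Starts from the first input partition, reassigns one randomly chosen
    object per step, and keeps the move only if the mean Rand Index does
    not decrease.
    """
    rng = random.Random(seed)
    target = list(partitions[0])
    best = mean_rand_index(target, partitions)
    for _ in range(n_iter):
        i = rng.randrange(len(target))
        old_label = target[i]
        target[i] = rng.randrange(n_clusters)
        score = mean_rand_index(target, partitions)
        if score >= best:
            best = score
        else:
            target[i] = old_label  # revert a move that lowered the score
    return target, best


if __name__ == "__main__":
    inputs = [
        [0, 0, 1, 1, 2, 2],
        [0, 0, 1, 1, 2, 2],
        [0, 0, 0, 1, 2, 2],
    ]
    consensus, score = consensus_hill_climb(inputs, n_clusters=3)
    print(consensus, round(score, 3))
```

Because the Rand Index depends only on which pairs are grouped together, relabeling clusters does not change the score, so the returned labels are meaningful only up to a permutation. Replacing `rand_index` with a chance-weighted variant would move this sketch closer to the PRI objective described above.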
Pages: 2315-2326
Number of pages: 12
Related papers
50 in total
  • [21] A New Fuzzy Clustering Validity Index With a Median Factor for Centroid-Based Clustering
    Wu, Chih-Hung
    Ouyang, Chen-Sen
    Chen, Li-Wen
    Lu, Li-Wei
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2015, 23 (03) : 701 - 718
  • [22] New Internal Clustering Evaluation Index Based on Line Segments
    Rojas Thomas, Juan Carlos
    Santos Penas, Matilde
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2019, PT I, 2019, 11871 : 534 - 541
  • [23] A new internal index based on density core for clustering validation
    Xie, Jiang
    Xiong, Zhong-Yang
    Dai, Qi-Zhu
    Wang, Xiao-Xia
    Zhang, Yu-Fang
    INFORMATION SCIENCES, 2020, 506 : 346 - 365
  • [24] Audio retrieval with fast relevance feedback based on constrained fuzzy clustering and stored index table
    Zhao, XY
    Zhuang, YT
    Liu, JW
    Wu, F
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2002, PROCEEDING, 2002, 2532 : 237 - 244
  • [25] New approach for the probabilistic power flow of distribution systems based on data clustering
    Sehsalar, Omid Zare
    Galvani, Sadjad
    Farsadi, Mortaza
    IET RENEWABLE POWER GENERATION, 2019, 13 (14) : 2531 - 2540
  • [26] A New Consensus Function based on Dual-Similarity Measurements for Clustering Ensemble
    Alqurashi, Tahani
    Wang, Wenjia
    PROCEEDINGS OF THE 2015 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (IEEE DSAA 2015), 2015, : 152 - 160
  • [27] New Graph based Sequence Clustering Approach for News Article Retrieval System
    Nagalavi, Deepa
    Hanumanthappa, M.
    2017 IEEE INTERNATIONAL CONFERENCE ON POWER, CONTROL, SIGNALS AND INSTRUMENTATION ENGINEERING (ICPCSI), 2017, : 1479 - 1482
  • [28] Curvelets based primitives for handwriting images analysis: application to document images retrieval and clustering
    Joutel, Guillaume
    Eglin, Veronique
    Emptoz, Hubert
    WAVELET APPLICATIONS IN INDUSTRIAL PROCESSING VI, 2009, 7248
  • [29] "Clickable Real World" Information Retrieval Application based on Geo-Visual Clustering
    Ito, Takashi
    Shimada, Atsushi
    Nagahara, Hajime
    Taniguchi, Rin-ichiro
    PROCEEDINGS OF THE 19TH KOREA-JAPAN JOINT WORKSHOP ON FRONTIERS OF COMPUTER VISION (FCV 2013), 2013, : 22 - 25
  • [30] A new subsequence similarity retrieval method based on inverted index in EAST
    Wang, Hao
    Yuan, Qiping
    Hu, Wenhui
    Xiao, Bingjia
    Ji, Zhenshan
    Zhang, Ruirui
    Zhang, Shuguang
    FUSION ENGINEERING AND DESIGN, 2022, 182