Supervised clustering of label ranking data using label preference information

Cited by: 13
Authors
Grbovic, Mihajlo [1]
Djuric, Nemanja [1]
Guo, Shengbo [2]
Vucetic, Slobodan [1]
Affiliations
[1] Temple Univ, Dept Comp & Informat Sci, Ctr Data Analyt & Biomed Informat, Philadelphia, PA 19122 USA
[2] Xerox Res Ctr Europe, F-38240 Meylan, France
Keywords
Label ranking; Supervised clustering; Preference learning; DESIGN; MODEL;
DOI
10.1007/s10994-013-5374-3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper studies supervised clustering in the context of label ranking data. The goal is to partition the feature space into K clusters, such that they are compact in both the feature and label ranking space. This type of clustering has many potential applications. For example, in target marketing we might want to come up with K different offers or marketing strategies for our target audience. Thus, we aim at clustering the customers' feature space into K clusters by leveraging the revealed or stated, potentially incomplete customer preferences over products, such that the preferences of customers within one cluster are more similar to each other than to those of customers in other clusters. We establish several baseline algorithms and propose two principled algorithms for supervised clustering. In the first baseline, the clusters are created in an unsupervised manner, followed by assigning a representative label ranking to each cluster. In the second baseline, the label ranking space is clustered first, followed by partitioning the feature space based on the central rankings. In the third baseline, clustering is applied to a new feature space consisting of both features and label rankings, followed by mapping back to the original feature and ranking space. The RankTree principled approach is based on a Ranking Tree algorithm previously proposed for label ranking prediction. Our modification starts with K random label rankings and iteratively splits the feature space to minimize the ranking loss, followed by re-calculation of the K rankings based on cluster assignments. The MM-PL approach is a multi-prototype supervised clustering algorithm based on the Plackett-Luce (PL) probabilistic ranking model. It represents each cluster with a union of Voronoi cells that are defined by a set of prototypes, and assigns each cluster a set of PL label scores that determine the cluster's central ranking. Cluster membership and ranking prediction for a new instance are determined by the cluster membership of its nearest prototype. The unknown cluster PL parameters and prototype positions are learned by minimizing the ranking loss, based on two variants of the expectation-maximization algorithm. Evaluation of the proposed algorithms was conducted on synthetic and real-life label ranking data by considering several measures of cluster goodness: (1) cluster compactness in feature space, (2) cluster compactness in label ranking space and (3) label ranking prediction loss. Experimental results demonstrate that the proposed MM-PL and RankTree models are superior to the baseline models. Further, MM-PL is shown to be much better than the other algorithms at handling situations with a significant fraction of missing label preferences.
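The prediction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it omits the expectation-maximization learning entirely; it only shows the standard Plackett-Luce log-likelihood of a ranking and a nearest-prototype lookup. All names (`pl_log_likelihood`, `central_ranking`, `predict`, and the toy arrays) are hypothetical and chosen for illustration, and Euclidean distance is an assumption.

```python
import numpy as np

def pl_log_likelihood(scores, ranking):
    """Log-probability of a ranking under the Plackett-Luce model.

    scores:  positive score for every label.
    ranking: label indices, most preferred first; it may cover only a
             subset of the labels (an incomplete ranking).
    """
    v = np.asarray(scores, dtype=float)[list(ranking)]
    # At each position the chosen label competes with the labels ranked after it.
    remaining = np.cumsum(v[::-1])[::-1]
    return float(np.sum(np.log(v) - np.log(remaining)))

def central_ranking(scores):
    """Most probable PL ranking: labels sorted by decreasing score."""
    return [int(i) for i in np.argsort(-np.asarray(scores, dtype=float))]

def predict(x, prototypes, proto_cluster, cluster_scores):
    """Assign x to the cluster of its nearest prototype (Euclidean distance,
    an assumption) and return that cluster's central ranking."""
    dists = np.linalg.norm(np.asarray(prototypes, dtype=float) - x, axis=1)
    k = int(proto_cluster[int(np.argmin(dists))])
    return k, central_ranking(cluster_scores[k])

# Toy data (hypothetical): three prototypes, two clusters, three labels.
prototypes = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
proto_cluster = np.array([0, 0, 1])          # cluster 0 owns two Voronoi cells
cluster_scores = np.array([[3.0, 2.0, 1.0], [1.0, 2.0, 3.0]])

cluster, ranking = predict(np.array([4.5, 5.0]), prototypes, proto_cluster, cluster_scores)
```

Because cluster 0 owns two prototypes in the toy data, its region is the union of two Voronoi cells, which is what lets a single cluster cover a non-convex part of the feature space.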
Pages: 191-225
Page count: 35
Related papers
50 in total
  • [11] Multi-label Classification Systems by the Use of Supervised Clustering
    Rastin, Niloofar
    Jahromi, Mansoor Zolghadri
    Taheri, Mohammad
    2017 19TH CSI INTERNATIONAL SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND SIGNAL PROCESSING (AISP), 2017, : 246 - 249
  • [12] On Inferring Image Label Information Using Rank Minimization for Supervised Concept Embedding
    Bespalov, Dmitriy
    Dahl, Anders Lindbjerg
    Bai, Bing
    Shokoufandeh, Ali
    IMAGE ANALYSIS: 17TH SCANDINAVIAN CONFERENCE, SCIA 2011, 2011, 6688 : 103 - 113
  • [13] Label Ranking Forests
    de Sa, Claudio Rebelo
    Soares, Carlos
    Knobbe, Arno
    Cortez, Paulo
    EXPERT SYSTEMS, 2017, 34 (01)
  • [14] BLPSeg: Balance the Label Preference in Scribble-Supervised Semantic Segmentation
    Wang, Yude
    Zhang, Jie
    Kan, Meina
    Shan, Shiguang
    Chen, Xilin
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 4921 - 4934
  • [15] Label Distribution Learning by Maintaining Label Ranking Relation
    Jia, Xiuyi
    Shen, Xiaoxia
    Li, Weiwei
    Lu, Yunan
    Zhu, Jihua
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (02) : 1695 - 1707
  • [16] Cautious label ranking with label-wise decomposition
    Destercke, Sebastien
    Masson, Marie-Helene
    Poss, Michael
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2015, 246 (03) : 927 - 935
  • [17] Discernible visualization of high dimensional data using label information
    Kiyadeh, Asef Pourmasoumi Hasan
    Zamiri, Amin
    Yazdi, Hadi Sadohgi
    Ghaemi, Hadi
    APPLIED SOFT COMPUTING, 2015, 27 : 474 - 486
  • [18] Multi-Label Semi-Supervised Learning using Regularized Kernel Spectral Clustering
    Mehrkanoon, Siamak
    Suykens, Johan A. K.
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 4009 - 4016
  • [19] Preference rules for label ranking: Mining patterns in multi-target relations
    de Sa, Claudio Rebelo
    Azevedo, Paulo
    Soares, Carlos
    Jorge, Alipio Mario
    Knobbe, Arno
    INFORMATION FUSION, 2018, 40 : 112 - 125
  • [20] A semi-supervised label distribution learning model with label correlations and data manifold exploration
    Guo, Ruiqi
    Peng, Yong
    Kong, Wanzeng
    Li, Fan
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (10) : 10094 - 10108