A kernel semi-supervised distance metric learning with relative distance: Integration with a MOO approach

Cited by: 10
Authors
Sanodiya, Rakesh Kumar [1]
Saha, Sriparna [1]
Mathew, Jimson [1]
Affiliations
[1] Indian Inst Technol Patna, Dept Comp Sci & Engn, Patna 801103, Bihar, India
Keywords
Semi-supervised classification; Multi-objective optimization; Bregman projection; Clustering; Metric learning; Feature selection; Genetic algorithm
DOI
10.1016/j.eswa.2018.12.051
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Metric learning, which aims to determine an appropriate distance function for accurately measuring the similarity and dissimilarity between data points, is one of the most popular ways to enhance the performance of machine learning methods such as K-means clustering and the K-nearest-neighbor classifier. These algorithms may perform poorly when they use the standard Euclidean distance, which ignores any statistical regularities that might be estimated from a large training set of labeled examples. In many real-world applications, the Euclidean distance may not capture the intrinsic similarity and dissimilarity between data points. Existing metric learning algorithms use a large amount of labeled data in the form of must-link (ML) and cannot-link constraints as side information, where the granularity of the true clustering is unknown. In contrast, our proposed approach uses a small amount of labeled data in the form of relative-distance constraints: equality constraints, C-eq, and inequality constraints, C-neq. To satisfy such constraints, the initial Euclidean distance matrix is projected, using Bregman projection, onto the convex subset defined by the constraints so that all of them are satisfied. Because Bregman projection is not orthogonal, satisfying the current constraint may cause previously satisfied constraints to become unsatisfied, so a proper subset of constraints must be selected to learn a better distance function. A multi-objective framework is used to select a good subset of constraints that helps in obtaining a proper labeling of the data set. The selected subset of constraints is used to adjust the initial kernel matrix, and K-means clustering is applied to the adjusted kernel matrix to label the data set. To evaluate the quality of the obtained labeling, different external and internal cluster validity indices are deployed. The values of these indices are simultaneously optimized using the search capability of multi-objective optimization (MOO), with the aim of selecting an appropriate subset of constraints. The proposed approach is evaluated on the UCI Human Activity Recognition using Smartphone Dataset v1.0 along with nine other popular data sets. Results show that our approach outperforms state-of-the-art semi-supervised metric learning algorithms with respect to different internal and external cluster validity indices. © 2018 Published by Elsevier Ltd.
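The relationship between a kernel matrix and relative-distance constraints described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `kernel_distances` and `violated`, and the triplet format "a is closer to b than to c", are assumptions made for illustration and do not reproduce the paper's exact C-eq and C-neq definitions. The sketch only shows how squared distances follow from a kernel matrix and how one might check which constraints a given kernel currently fails to satisfy.

```python
import numpy as np

def kernel_distances(K):
    """Squared distances induced by a kernel matrix:
    d_ij^2 = K_ii + K_jj - 2 * K_ij."""
    diag = np.diag(K)
    D = diag[:, None] + diag[None, :] - 2.0 * K
    return np.maximum(D, 0.0)  # clip tiny negatives caused by round-off

def violated(D, triplets):
    """Return the triplets (a, b, c) for which the hypothetical
    constraint 'a is closer to b than to c' does not hold."""
    return [(a, b, c) for a, b, c in triplets if not D[a, b] < D[a, c]]

# Three points on a line; a linear kernel gives plain squared Euclidean distances.
X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
K = X @ X.T
D = kernel_distances(K)

# The first triplet holds for this kernel, the second does not.
unsatisfied = violated(D, [(0, 1, 2), (0, 2, 1)])
```

In a constraint-selection setting such as the one the abstract describes, a check of this kind could be rerun after each kernel adjustment to see whether earlier constraints have become unsatisfied.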
Pages: 233-248
Number of pages: 16