Gravitational search algorithm and K-means for simultaneous feature selection and data clustering: a multi-objective approach

Cited by: 0
Authors
Jay Prakash
Pramod Kumar Singh
Affiliations
[1] ABV - Indian Institute of Information Technology and Management Gwalior, Computational Intelligence and Data Mining Research Laboratory
Source
Soft Computing | 2019 / Vol. 23
Keywords
Feature selection; Data clustering; Multi-objective optimization; Gravitational search algorithm
DOI
Not available
Abstract
Clustering is an unsupervised classification method used to group the objects of an unlabeled data set. High-dimensional data sets generally contain irrelevant and redundant features alongside the relevant ones, and these deteriorate the clustering result. Feature selection is therefore necessary: selecting a subset of relevant features improves the discrimination ability of the original feature set, which in turn improves the clustering result. Though many metaheuristics have been suggested to select a subset of relevant features in a wrapper framework based on some criteria, most of them are marred by three key issues. First, they require the objects' class information a priori, which is unknown in unsupervised feature selection. Second, feature subset selection is devised on a single validity measure; hence, it produces a single best solution biased toward the cardinality of the feature subset. Third, they have difficulty avoiding local optima owing to a lack of balance between exploration and exploitation in the feature search space. To deal with the first issue, we use an unsupervised feature selection method in which no class information is required. To address the second issue, we follow a Pareto-based approach to obtain diverse trade-off solutions by optimizing two conceptually contradicting validity measures, the silhouette index (Sil) and feature cardinality (d). For the third issue, we introduce a genetic crossover operator to improve diversity in a recent Newtonian law of gravity-based metaheuristic, the binary gravitational search algorithm (BGSA), in a multi-objective optimization scenario; the resulting method is named improved multi-objective BGSA for feature selection (IMBGSAFS). We use ten real-world data sets to compare the IMBGSAFS results with three multi-objective methods in a wrapper framework (MBGSA, MOPSO, and NSGA-II) and with Pearson's linear correlation coefficient (FM-CC) as a multi-objective filter method. We employ four multi-objective quality measures: convergence, diversity, coverage, and ONVG. The obtained results show the superiority of IMBGSAFS over its competitors. An external clustering validity index, the F-measure, also confirms this finding. As the decision maker picks only a single solution from the set of trade-off solutions, we employ the F-measure to select a final single solution from the external archive. The quality of the final solution achieved by IMBGSAFS is superior to that of its competitors in terms of clustering accuracy and/or smaller subset size.
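The Pareto-based trade-off between the silhouette index (Sil, to be maximized) and feature cardinality (d, to be minimized) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of uniform crossover (the abstract only says "genetic crossover operator"), and the silhouette values are all assumptions made for illustration. In a wrapper setting each silhouette value would come from running K-means on the selected features; here the values are hard-coded.

```python
import random
from typing import List, Sequence, Tuple

Mask = Tuple[int, ...]        # 1 = feature kept, 0 = feature dropped
Scored = Tuple[Mask, float]   # (feature mask, silhouette index of its clustering)


def dominates(a: Scored, b: Scored) -> bool:
    """Return True if a Pareto-dominates b.

    Objectives: maximize the silhouette index (Sil) and minimize the
    feature cardinality d (number of selected features).
    """
    sil_a, d_a = a[1], sum(a[0])
    sil_b, d_b = b[1], sum(b[0])
    no_worse = sil_a >= sil_b and d_a <= d_b
    strictly_better = sil_a > sil_b or d_a < d_b
    return no_worse and strictly_better


def pareto_archive(candidates: Sequence[Scored]) -> List[Scored]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


def uniform_crossover(p1: Mask, p2: Mask, rng: random.Random) -> Mask:
    """Assumed crossover variant: each bit is copied from either parent."""
    return tuple(a if rng.random() < 0.5 else b for a, b in zip(p1, p2))


# Hypothetical candidates; the silhouette values are made up for illustration.
candidates: List[Scored] = [
    ((1, 1, 1, 1), 0.40),
    ((1, 1, 0, 0), 0.55),
    ((1, 0, 0, 0), 0.30),
    ((0, 1, 1, 0), 0.50),
    ((1, 1, 1, 0), 0.55),
]
archive = pareto_archive(candidates)

# A new candidate mask produced by recombining two archive members.
child = uniform_crossover(archive[0][0], archive[1][0], random.Random(0))
```

With these made-up values, the archive retains the two-feature mask with Sil 0.55 and the one-feature mask with Sil 0.30; neither dominates the other, which is the kind of trade-off set from which a final solution is later picked.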
Pages: 2083-2100
Number of pages: 17
Related papers
  • [1] Gravitational search algorithm and K-means for simultaneous feature selection and data clustering: a multi-objective approach
    Prakash, Jay
    Singh, Pramod Kumar
    [J]. SOFT COMPUTING, 2019, 23 (06) : 2083 - 2100
  • [2] Feature Selection using K-Means Genetic Algorithm for Multi-objective Optimization
    Anusha, M.
    Sathiaseelan, J. G. R.
    [J]. 3RD INTERNATIONAL CONFERENCE ON RECENT TRENDS IN COMPUTING 2015 (ICRTC-2015), 2015, 57 : 1074 - 1080
  • [3] A Feature Selection Method Based on Multi-objective Optimisation with Gravitational Search Algorithm
    Dickson, Bolou Bolou
    Wang, Shengsheng
    Dong, Ruyi
    Wen, Changji
    [J]. GEO-INFORMATICS IN RESOURCE MANAGEMENT AND SUSTAINABLE ECOSYSTEM, 2016, 569 : 549 - 558
  • [4] Feature Selection Algorithm Based on K-means Clustering
    Tang, Xue
    Dong, Min
    Bi, Sheng
    Pei, Maofeng
    Cao, Dan
    Xie, Cheche
    Chi, Sunhuang
    [J]. 2017 IEEE 7TH ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (CYBER), 2017, : 1522 - 1527
  • [5] Particle Swarm Optimization with K-means for Simultaneous Feature Selection and Data Clustering
    Prakash, Jay
    Singh, Pramod Kumar
    [J]. 2015 SECOND INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND MACHINE INTELLIGENCE (ISCMI), 2015, : 74 - 78
  • [6] Automatic clustering and feature selection using multi-objective crow search algorithm
    Ranjan, Rajesh
    Chhabra, Jitender Kumar
    [J]. APPLIED SOFT COMPUTING, 2023, 142
  • [7] K-means Clustering with Feature Selection for Stream Data
    Wang, Xiao-dong
    Chen, Rung-Ching
    Yan, Fei
    Hendry
    [J]. 2018 INTERNATIONAL SYMPOSIUM ON COMPUTER, CONSUMER AND CONTROL (IS3C 2018), 2018, : 453 - 456
  • [8] A new multi-objective differential evolution approach for simultaneous clustering and feature selection
    Hancer, Emrah
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2020, 87
  • [9] A Cheap Feature Selection Approach for the K-Means Algorithm
    Capo, Marco
    Perez, Aritz
    Lozano, Jose A.
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (05) : 2195 - 2208
  • [10] Simultaneous Continuous Feature Selection and K Clustering by Multi Objective Genetic Algorithm
    Dutta, Dipankar
    Dutta, Paramartha
    Sil, Jaya
    [J]. PROCEEDINGS OF THE 2013 3RD IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE (IACC), 2013, : 937 - 942