Clustering-based hybrid feature selection approach for high dimensional microarray data

Cited by: 19
Authors
Babu, Samson Anosh P. [1]
Annavarapu, Chandra Sekhara Rao [1]
Dara, Suresh [2]
Affiliations
[1] IIT ISM Dhanbad, Dept Comp Sci & Engn, Jharkhand, India
[2] B V Raju Inst Technol, Dept Comp Sci & Engn, Narsapur, Telangana, India
Keywords
Microarray data; Clustering; Ant colony optimization; Classification; CELLULAR LEARNING AUTOMATA; GENE-EXPRESSION DATA; DISTRIBUTED FEATURE-SELECTION; CLASSIFICATION; ALGORITHM; OPTIMIZATION; DISCOVERY;
DOI
10.1016/j.chemolab.2021.104305
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
DNA microarrays are used to monitor the expression levels of significant genes. Most microarray data are assumed to be high dimensional, redundant, and noisy. This paper proposes a clustering-based hybrid gene selection approach to reduce the high dimensionality and increase the classification accuracy of cancer microarray data. The proposed approach combines the k-means clustering algorithm with the signal-to-noise-ratio ranking method as a primary filter to reduce the high dimensionality of the microarray dataset. A cellular learning automaton combined with ant colony optimization is then applied to the reduced dataset as a wrapper method to obtain an optimized gene subset. The classifiers adopted to evaluate the proposed method are support vector machine, K-nearest neighbor, and Naive Bayes. The experiments showed promising results in gene subset selection and classification.
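
The filtering stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the signal-to-noise formula or say how clustering and ranking are combined, so the sketch assumes a two-class SNR of the form |mean_0 - mean_1| / (std_0 + std_1) and assumes that the top-ranked gene is kept from each k-means cluster. The names `snr_score`, `kmeans`, `filter_genes`, and `top_per_cluster` are hypothetical, and the wrapper stage (cellular learning automata with ant colony optimization) is not covered.

```python
import random
from statistics import mean, pstdev


def snr_score(values, labels):
    """Two-class signal-to-noise ratio of one gene (assumed form, not from the paper)."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    denom = pstdev(a) + pstdev(b)
    return abs(mean(a) - mean(b)) / denom if denom > 0 else 0.0


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return assign


def filter_genes(expr, labels, k, top_per_cluster=1):
    """expr: one list of expression values per gene (genes x samples).

    Assumed combination: cluster the genes, then keep the highest-SNR
    gene(s) from each cluster. Returns sorted indices of kept genes.
    """
    scores = [snr_score(gene, labels) for gene in expr]
    assign = kmeans(expr, k)
    selected = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        members.sort(key=lambda i: scores[i], reverse=True)
        selected.extend(members[:top_per_cluster])
    return sorted(selected)
```

Under these assumptions, calling `filter_genes(expr, labels, k)` on a genes-by-samples matrix returns the indices of the retained genes, which a wrapper method could then search over.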
Pages: 17
Related papers
50 items in total
  • [1] Clustering-based Sequential Feature Selection Approach for High Dimensional Data Classification
    Alimoussa, M.
    Porebski, A.
    Vandenbroucke, N.
    Thami, R. Oulad Haj
    El Fkihi, S.
    [J]. VISAPP: PROCEEDINGS OF THE 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL. 4: VISAPP, 2021, : 122 - 132
  • [2] A Fast Clustering-Based Feature Subset Selection Algorithm for High-Dimensional Data
    Song, Qinbao
    Ni, Jingjie
    Wang, Guangtao
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2013, 25 (01) : 1 - 14
  • [3] Implementation of FAST Clustering-Based Feature Subset Selection Algorithm for High-Dimensional Data
    Shilu, Smit
    Sheth, Kushal
    Mehul, Ekata
    [J]. PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ICT FOR SUSTAINABLE DEVELOPMENT ICT4SD 2015, VOL 2, 2016, 409 : 203 - 213
  • [4] Graph clustering-based discretization approach to microarray data
    Kittakorn Sriwanna
    Tossapon Boongoen
    Natthakan Iam-On
    [J]. Knowledge and Information Systems, 2019, 60 : 879 - 906
  • [5] Graph clustering-based discretization approach to microarray data
    Sriwanna, Kittakorn
    Boongoen, Tossapon
    Iam-On, Natthakan
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2019, 60 (02) : 879 - 906
  • [6] Clustering-based feature selection
    School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510006, China
    [J]. Tien Tzu Hsueh Pao, 2008, SUPPL. : 157 - 160
  • [7] Hybrid feature selection approach based on GRASP for cancer microarray data
    Nagpal, Arpita
    Gaur, Deepti
    [J]. Journal of Computing and Information Technology, 2017, 25 (02) : 133 - 148
  • [8] A Novel Clustering-Based Hybrid Feature Selection Approach Using Ant Colony Optimization
    Rajesh Dwivedi
    Aruna Tiwari
    Neha Bharill
    Milind Ratnaparkhe
    [J]. Arabian Journal for Science and Engineering, 2023, 48 : 10727 - 10744
  • [9] A Novel Clustering-Based Hybrid Feature Selection Approach Using Ant Colony Optimization
    Dwivedi, Rajesh
    Tiwari, Aruna
    Bharill, Neha
    Ratnaparkhe, Milind
    [J]. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2023, 48 (08) : 10727 - 10744
  • [10] Distance based feature selection for clustering microarray data
    Dash, Manoranjan
    Gopalkrishnan, Vivekanand
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2008, 4947 : 512 - 519