Gene encoder: a feature selection technique through unsupervised deep learning-based clustering for large gene expression data

Cited by: 0
Authors
Uzma
Feras Al-Obeidat
Abdallah Tubaishat
Babar Shah
Zahid Halim
Affiliations
[1] Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, The Machine Intelligence Research Group (MInG), Faculty of Computer Science and Engineering
[2] College of Technological Innovation at Zayed University
Source
Neural Computing & Applications, 2022, 34 (11)
Keywords
Deep learning; Gene expression; Clustering; Unsupervised learning; Genetic algorithm
DOI
Not available
Abstract
Cancer is a severe condition in which uncontrolled cell division forms a tumor that can spread to other tissues of the body, so new medications and treatment methods are in demand. Classification of microarray data plays a vital role in this effort, and selecting the relevant genes is an important step in that classification. This work presents gene encoder, an unsupervised two-stage feature selection technique for classifying cancer samples. The first stage aggregates three filter methods: principal component analysis, correlation, and spectral-based feature selection. The second stage applies a genetic algorithm that evaluates each chromosome using autoencoder-based clustering. The resulting feature subset is then used for the classification task. Three classifiers (support vector machine, k-nearest neighbors, and random forest) are used to avoid dependence on any single classifier. Six benchmark gene expression datasets are used for performance evaluation, and the method is compared with four state-of-the-art related algorithms. Three sets of experiments are carried out: evaluating the selected features through sample-based clustering, tuning the parameters to their optimal values, and selecting the better-performing classifier. The comparison is based on accuracy, recall, false positive rate, precision, F-measure, and entropy. The results suggest that the proposed method performs better.
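The evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so the code below assumes the standard confusion-matrix definitions for a single positive class and, for entropy, the size-weighted Shannon entropy of the true labels within each cluster. The function names `binary_metrics` and `clustering_entropy` and the toy labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix measures for one positive class (assumed definitions)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "recall": recall,
        "false_positive_rate": ratio(fp, fp + tn),
        "precision": precision,
        "f_measure": ratio(2 * precision * recall, precision + recall),
    }


def clustering_entropy(labels, clusters):
    """Size-weighted mean Shannon entropy (in bits) of true labels per cluster.

    Assumed definition: 0.0 means every cluster contains a single class.
    """
    members = {}
    for label, cluster in zip(labels, clusters):
        members.setdefault(cluster, []).append(label)

    total = len(labels)
    entropy = 0.0
    for group in members.values():
        counts = Counter(group)
        h = -sum((c / len(group)) * math.log2(c / len(group))
                 for c in counts.values())
        entropy += (len(group) / total) * h
    return entropy


# Hypothetical toy labels: 1 = tumor sample, 0 = normal sample.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

scores = binary_metrics(y_true, y_pred)
toy_entropy = clustering_entropy(y_true, y_pred)
```

For datasets with more than two classes, one common choice is to compute these measures per class (one-vs-rest) and average them; whether the paper does so is not stated in the abstract.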
Pages: 8309 - 8331
Number of pages: 22
Related papers
50 in total
  • [1] Gene encoder: a feature selection technique through unsupervised deep learning-based clustering for large gene expression data
    Uzma
    Al-Obeidat, Feras
    Tubaishat, Abdallah
    Shah, Babar
    Halim, Zahid
    [J]. NEURAL COMPUTING & APPLICATIONS, 2022, 34 (11) : 8309 - 8331
  • [2] Unsupervised Feature Selection for Microarray Gene Expression Data Based on Discriminative Structure Learning
    Ye, Xiucai
    Sakurai, Tetsuya
    [J]. JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2018, 24 (06) : 725 - 741
  • [3] PSO Based Feature Selection for Clustering Gene Expression Data
    Deepthi, P. S.
    Thampi, Sabu M.
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, INFORMATICS, COMMUNICATION AND ENERGY SYSTEMS (SPICES), 2015,
  • [4] Feature selection and gene clustering from gene expression data
    Mitra, P
    Majumder, DD
    [J]. PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, 2004, : 343 - 346
  • [5] CGUFS: A clustering-guided unsupervised feature selection algorithm for gene expression data
    Xu, Zhaozhao
    Yang, Fangyuan
    Wang, Hong
    Sun, Junding
    Zhu, Hengde
    Wang, Shuihua
    Zhang, Yudong
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2023, 35 (09)
  • [6] Informative Feature Clustering and Selection for Gene Expression Data
    Yang, Yuqi
    Yin, Pengshuai
    Luo, Zhihang
    Gu, Wenwen
    Chen, Renjie
    Wu, Qingyao
    [J]. IEEE ACCESS, 2019, 7 : 169174 - 169184
  • [7] Unsupervised feature learning-based encoder and adversarial networks
    Endang Suryawati
    Hilman F. Pardede
    Vicky Zilvan
    Ade Ramdan
    Dikdik Krisnandi
    Ana Heryana
    R. Sandra Yuwana
    R. Budiarianto Suryo Kusumo
    Andria Arisal
    Ahmad Afif Supianto
    [J]. Journal of Big Data, 8
  • [8] Unsupervised feature learning-based encoder and adversarial networks
    Suryawati, Endang
    Pardede, Hilman F.
    Zilvan, Vicky
    Ramdan, Ade
    Krisnandi, Dikdik
    Heryana, Ana
    Yuwana, R. Sandra
    Kusumo, R. Budiarianto Suryo
    Arisal, Andria
    Supianto, Ahmad Afif
    [J]. JOURNAL OF BIG DATA, 2021, 8 (01)
  • [9] Simultaneous Feature Selection and Unsupervised Clustering for Gene-Expression Data in Multiobjective Optimization Framework
    Alok, Abhay Kumar
    Kanekar, Neha
    Saha, Sriparna
    Ekbal, Asif
    [J]. 2014 9TH INTERNATIONAL CONFERENCE ON INDUSTRIAL AND INFORMATION SYSTEMS (ICIIS), 2014, : 691 - 696
  • [10] An Efficient Feature Selection Technique for Gene Expression Data
    Chandra, B.
    [J]. 2018 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY (CIBCB), 2018, : 132 - 137