A biobjective feature selection algorithm for large omics datasets

Cited by: 1
Authors
Cavique, Luis [1, 2]
Mendes, Armando B. [3, 4]
Martiniano, Hugo F. M. C. [1, 5]
Correia, Luis [1]
Affiliations
[1] FCUL, MAS BioISI, Lisbon, Portugal
[2] Univ Aberta, Lisbon, Portugal
[3] Univ Acores, Ponta Delgada, Portugal
[4] Univ Minho, Algoritmi, Braga, Portugal
[5] Inst Dr Ricardo Jorge, Lisbon, Portugal
Keywords
biobjective optimization; feature selection; heuristic decomposition; logical analysis of data
DOI
10.1111/exsy.12301
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Feature selection is one of the most important concepts in data mining when dimensionality reduction is needed. Its performance is measured by predictive accuracy and by the comprehensibility of the result. Consistency-based methods are a significant category of feature selection research that substantially improves comprehensibility by applying the parsimony principle. In this work, the biobjective version of the logical analysis of inconsistent data algorithm is applied to large volumes of data. To deal with hundreds of thousands of attributes, a heuristic decomposition uses parallel processing to solve a set covering problem, combined with a cross-validation technique. Each biobjective solution reports the reduced number of features and the accuracy. The algorithm is applied to omics datasets with genome-like characteristics from patients with rare diseases.
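The abstract frames consistency-based feature selection as a set covering problem. As a rough illustration of that idea only, the sketch below treats every pair of observations with different class labels as an element to be covered, and every feature as a set that covers the pairs it distinguishes. It then applies the textbook greedy set-cover heuristic. This is an assumption-laden toy: it is not the authors' heuristic decomposition, it does no parallel processing or cross-validation, and the function names `pairs_to_cover` and `greedy_feature_cover` are hypothetical.

```python
from itertools import combinations


def pairs_to_cover(X, y):
    """Map each pair of rows with different labels to the features that separate them."""
    cover = {}
    for i, j in combinations(range(len(X)), 2):
        if y[i] != y[j]:
            diff = {k for k, (a, b) in enumerate(zip(X[i], X[j])) if a != b}
            # A pair with identical feature values but different labels is
            # inconsistent; no feature can cover it, so this sketch skips it.
            if diff:
                cover[(i, j)] = diff
    return cover


def greedy_feature_cover(X, y):
    """Greedy set cover: repeatedly pick the feature separating the most uncovered pairs."""
    cover = pairs_to_cover(X, y)
    uncovered = set(cover)
    selected = []
    while uncovered:
        counts = {}
        for pair in uncovered:
            for k in cover[pair]:
                counts[k] = counts.get(k, 0) + 1
        # Ties are broken in favour of the lowest feature index.
        best = max(sorted(counts), key=counts.get)
        selected.append(best)
        uncovered = {pair for pair in uncovered if best not in cover[pair]}
    return selected


# Toy data: feature 0 alone separates the two classes.
X = [[0, 0, 1], [0, 1, 1], [1, 0, 0], [1, 1, 0]]
y = [0, 0, 1, 1]
print(greedy_feature_cover(X, y))
```

The greedy rule gives an approximate cover, not a provably minimal one. The second objective named in the abstract, accuracy, would have to be estimated separately, for instance by cross-validating a classifier on the selected features.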
Pages: 14