Differential prioritization in feature selection and classifier aggregation for multiclass microarray datasets

Cited by: 15
Authors
Ooi, Chia Huey [1 ]
Chetty, Madhu [1 ]
Teng, Shyh Wei [1 ]
Affiliations
[1] Monash Univ, Gippsland Sch Informat Technol, Churchill, Vic, Australia
Keywords
tissue classification; microarray data analysis; multiclass classification; feature selection; classifier aggregation
DOI
10.1007/s10618-006-0055-5
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The high dimensionality of microarray datasets makes multiclass tissue classification difficult, the main challenge being the selection of relevant, non-redundant features to form the predictor set for classifier training. It is also important to vary the emphases placed on relevance and redundancy during the search for the predictor set, which is done through the degree of differential prioritization (DDP). Furthermore, there are several types of decomposition technique for the feature selection (FS) problem: all-classes-at-once, one-vs.-all (OVA), or pairwise (PW). In multiclass problems, there is also the need to consider the type of classifier aggregation used: non-aggregated (a single machine) or aggregated (OVA or PW). We first propose a systematic approach to combining the distinct problems of FS and classification. Then, using eight well-known multiclass microarray datasets, we empirically demonstrate the effectiveness of the DDP in various combinations of FS decomposition types and classifier aggregation methods. Aided by the variable DDP, feature selection leads to classification performance that is better than that of rank-based or equal-priorities scoring methods, and to accuracies higher than previously reported for benchmark datasets with a large number of classes. Finally, based on several criteria, we make general recommendations on the optimal choice of the combination of FS decomposition type and classifier aggregation method for multiclass microarray datasets.
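The idea of varying the emphasis between relevance and redundancy can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: the score `relevance ** alpha * antiredundancy ** (1 - alpha)`, the between-class/within-class ratio used as relevance, the correlation-based antiredundancy, and the names `relevance`, `select_features`, and `alpha` (standing in for the DDP) are all hypothetical, as is the synthetic data.

```python
import numpy as np

def relevance(X, y):
    """Per-feature ratio of between-class to within-class sum of squares (assumed relevance measure)."""
    overall = X.mean(axis=0)
    bss = np.zeros(X.shape[1])
    wss = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        bss += len(Xc) * (mu - overall) ** 2
        wss += ((Xc - mu) ** 2).sum(axis=0)
    return bss / (wss + 1e-12)

def select_features(X, y, k, alpha):
    """Greedy forward selection; alpha in (0, 1] weights relevance against antiredundancy."""
    rel = relevance(X, y)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(rel))]  # start from the single most relevant feature
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            S = selected + [j]
            v = rel[S].mean()                       # relevance of the candidate set
            u = (1.0 - corr[np.ix_(S, S)]).mean()   # antiredundancy of the candidate set
            score = v ** alpha * u ** (1.0 - alpha)
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected

# Synthetic data (hypothetical): feature 1 is a noisy copy of feature 0.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 20)
f0 = y + 0.1 * rng.standard_normal(60)
f1 = f0 + 0.2 * rng.standard_normal(60)
f2 = (y == 2) + 0.3 * rng.standard_normal(60)
X = np.column_stack([f0, f1, f2, rng.standard_normal((60, 3))])

relevance_only = select_features(X, y, k=2, alpha=1.0)
balanced = select_features(X, y, k=2, alpha=0.5)
print(relevance_only, balanced)
```

With `alpha = 1.0` the redundancy term drops out and the selection reduces to rank-based scoring, so the near-duplicate feature is admitted; lowering `alpha` penalizes it. In this toy setting too small an `alpha` can also favor uncorrelated but uninformative features, which is one reason a tunable rather than fixed weighting is of interest.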
Pages: 329-366
Page count: 38
Related papers
50 in total
  • [21] An Experimental Comparison of Feature-Selection and Classification Methods for Microarray Datasets
    Cilia, Nicole Dalia
    De Stefano, Claudio
    Fontanella, Francesco
    Raimondo, Stefano
    di Freca, Alessandra Scotto
    INFORMATION, 2019, 10 (03)
  • [22] A GP Based Approach to the Classification of Multiclass Microarray Datasets
    Xu, Chun-Gui
    Liu, Kun-Hong
    ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS, PROCEEDINGS: WITH ASPECTS OF ARTIFICIAL INTELLIGENCE, 2008, 5227 : 340 - 346
  • [23] Methodology article - Differential prioritization between relevance and redundancy in correlation-based feature selection techniques for multiclass gene expression data
    Ooi, Chia Huey
    Chetty, Madhu
    Teng, Shyh Wei
    BMC BIOINFORMATICS, 2006, 7 (1)
  • [24] MULTICLASS BAYESIAN FEATURE SELECTION
    Foroughi, Ali
    Dalton, Lori A.
    2017 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2017), 2017, : 725 - 729
  • [25] Stable Feature Selection using Improved Whale Optimization Algorithm for Microarray Datasets
    Theng, Dipti
    Bhoyar, Kishor K.
    ADCAIJ-ADVANCES IN DISTRIBUTED COMPUTING AND ARTIFICIAL INTELLIGENCE JOURNAL, 2023, 12 (01)
  • [26] A graph partitioning-based hybrid feature selection method in microarray datasets
    Oubaouzine, Abdelali
    Ouaderhman, Tayeb
    Chamlal, Hasna
    Knowledge and Information Systems, 2025, 67 (01) : 633 - 660
  • [27] Supervised feature selection on gene expression microarray datasets using manifold learning
    Zare, Masoumeh
    Azizizadeh, Najmeh
    Kazemipour, Ali
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2023, 237
  • [28] Feature selection techniques for microarray datasets: a comprehensive review, taxonomy, and future directions
    Balakrishnan, Kulanthaivel
    Dhanalakshmi, Ramasamy
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2022, 23 (10) : 1451 - 1478
  • [29] A model-based relevance estimation approach for feature selection in microarray datasets
    Bontempi, Gianluca
    Meyer, Patrick E.
    ARTIFICIAL NEURAL NETWORKS - ICANN 2008, PT II, 2008, 5164 : 21 - 31
  • [30] CCFS: A cooperating coevolution technique for large scale feature selection on microarray datasets
    Ebrahimpour, Mohammad K.
    Nezamabadi-Pour, Hossein
    Eftekhari, Mandi
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2018, 73 : 171 - 178