Dataset complexity in gene expression based cancer classification using ensembles of k-nearest neighbors

Cited by: 23
Authors
Okun, Oleg [1]
Priisalu, Helen [2]
Affiliations
[1] Univ Oulu, Elect & Informat Engn Dept, Oulu 90014, Finland
[2] Tallinn Univ Technol, Inst Cybernet, EE-12618 Tallinn, Estonia
Keywords
Pattern recognition; Gene expression; Cancer classification; k-nearest neighbors; Ensemble of classifiers; FEATURE-SELECTION; MICROARRAY DATA; DNA; PREDICTION; CLASSIFIERS; TUMOR
DOI
10.1016/j.artmed.2008.08.004
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Objective: We explore the link between dataset complexity, which determines how difficult a dataset is to classify, and classification performance, measured by the low-variance and low-biased bolstered resubstitution error of k-nearest neighbor classifiers. Methods and material: Gene expression based cancer classification is the task in this study. Six gene expression datasets covering different types of cancer constitute the test data. Results: Through extensive simulation coupled with the copula method for analyzing association in bivariate data, we show that dataset complexity and bolstered resubstitution error are statistically dependent. Based on this result, we propose a new scheme for generating ensembles of classifiers that selects low-complexity feature subsets for the ensemble members; according to the dependence relation found, such subsets yield accurate members. Conclusion: Experiments with six gene expression datasets demonstrate that our ensemble generating scheme, based on the dependence between dataset complexity and classification error, is superior both to a single best classifier in the ensemble and to the traditional ensemble construction scheme that ignores dataset complexity. © 2008 Elsevier B.V. All rights reserved.
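The ensemble scheme summarized in the abstract (pick low-complexity feature subsets, train one k-nearest neighbor member per subset, combine by voting) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the complexity measure (a maximum Fisher discriminant ratio, where a higher ratio is read as lower complexity), the random sampling of candidate subsets, plain majority voting, binary 0/1 labels, and all function and parameter names. The bolstered resubstitution error is not implemented.

```python
import numpy as np

def fisher_ratio(X, y):
    """Maximum Fisher discriminant ratio over features for two classes (0/1).

    Assumed stand-in for a complexity measure: a higher ratio is read
    as a lower-complexity (easier) feature subset.
    """
    a, b = X[y == 0], X[y == 1]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0) + b.var(axis=0) + 1e-12  # guard against zero variance
    return float(np.max(num / den))

def knn_predict(X_train, y_train, X_test, k=3):
    """Plain k-NN with squared Euclidean distance; use odd k to avoid ties."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return (y_train[nearest].mean(axis=1) > 0.5).astype(int)

def select_subsets(X, y, n_candidates=50, n_members=5, subset_size=10, seed=0):
    """Sample random feature subsets and keep the lowest-complexity ones."""
    rng = np.random.default_rng(seed)
    candidates = [rng.choice(X.shape[1], size=subset_size, replace=False)
                  for _ in range(n_candidates)]
    candidates.sort(key=lambda idx: fisher_ratio(X[:, idx], y), reverse=True)
    return candidates[:n_members]

def ensemble_predict(subsets, X_train, y_train, X_test, k=3):
    """Majority vote over k-NN members, one member per feature subset."""
    votes = np.stack([knn_predict(X_train[:, s], y_train, X_test[:, s], k)
                      for s in subsets])
    return (votes.mean(axis=0) > 0.5).astype(int)
```

Using an odd number of members together with an odd k keeps both voting stages free of ties in the two-class case.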
Pages: 151-162
Number of pages: 12
Related papers
50 in total
  • [1] Dataset Complexity Can Help to Generate Accurate Ensembles of K-Nearest Neighbors
    Okun, Oleg
    Valentini, Giorgio
    2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008, : 450 - 457
  • [2] Gene expression cancer classification using modified K-Nearest Neighbors technique
    Ayyad, Sarah M.
    Saleh, Ahmed I.
    Labib, Labib M.
    BIOSYSTEMS, 2019, 176 : 41 - 51
  • [3] Ensembles of K-Nearest Neighbors and Dimensionality Reduction
    Okun, Oleg
    Priisalu, Helen
    2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008, : 2032 - +
  • [4] Classification with learning k-nearest neighbors
    Laaksonen, J
    Oja, E
    ICNN - 1996 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS. 1-4, 1996, : 1480 - 1483
  • [5] Classification of Contaminated Insulators Using k-Nearest Neighbors Based on Computer Vision
    Corso, Marcelo Picolotto
    Perez, Fabio Luis
    Stefenon, Stefano Frizzo
    Yow, Kin-Choong
    Garcia Ovejero, Raul
    Quietinho Leithardt, Valderi Reis
    COMPUTERS, 2021, 10 (09)
  • [6] Classification using the local probabilistic centers of k-nearest neighbors
    Li, Bo Yu
    Chen, Yun Wen
    18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 3, PROCEEDINGS, 2006, : 1220 - +
  • [7] Locally Adaptive Text Classification based k-nearest Neighbors
    Yu, Xiao-gao
    Yu, Xiao-peng
    2007 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-15, 2007, : 5651 - +
  • [8] AutoML for Stream k-Nearest Neighbors Classification
    Bahri, Maroua
    Veloso, Bruno
    Bifet, Albert
    Gama, Joao
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 597 - 602
  • [9] Compressed k-Nearest Neighbors Ensembles for Evolving Data Streams
    Bahri, Maroua
    Bifet, Albert
    Maniu, Silviu
    de Mello, Rodrigo F.
    Tziortziotis, Nikolaos
    ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325 : 961 - 968
  • [10] Feature Extraction, Selection, and K-Nearest Neighbors Algorithm for Shark Behavior Classification Based on Imbalanced Dataset
    Yang, Yu
    Yeh, Hen-Geul
    Zhang, Wenlu
    Lee, Calvin J.
    Meese, Emily N.
    Lowe, Christopher G.
    IEEE SENSORS JOURNAL, 2021, 21 (05) : 6429 - 6439