Dataset complexity in gene expression based cancer classification using ensembles of k-nearest neighbors

Cited by: 23
Authors
Okun, Oleg [1]
Priisalu, Helen [2]
Affiliations
[1] Univ Oulu, Elect & Informat Engn Dept, Oulu 90014, Finland
[2] Tallinn Univ Technol, Inst Cybernet, EE-12618 Tallinn, Estonia
Keywords
Pattern recognition; Gene expression; Cancer classification; k-nearest neighbors; Ensemble of classifiers; FEATURE-SELECTION; MICROARRAY DATA; DNA; PREDICTION; CLASSIFIERS; TUMOR;
DOI
10.1016/j.artmed.2008.08.004
CLC number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Objective: We explore the link between dataset complexity, which determines how difficult a dataset is to classify, and classification performance, measured by the low-variance, low-bias bolstered resubstitution error of k-nearest neighbor classifiers.
Methods and material: Gene expression based cancer classification is the task used in this study. Six gene expression datasets covering different types of cancer constitute the test data.
Results: Through extensive simulation coupled with the copula method for analyzing association in bivariate data, we show that dataset complexity and bolstered resubstitution error are associated, in the sense of statistical dependence. As a result, we propose a new scheme for generating ensembles of classifiers that selects low-complexity feature subsets for the ensemble members; according to the dependence relation found, such subsets yield accurate members.
Conclusion: Experiments with six gene expression datasets demonstrate that our ensemble generation scheme, based on the dependence between dataset complexity and classification error, is superior both to the single best classifier in the ensemble and to the traditional ensemble construction scheme, which ignores dataset complexity. (c) 2008 Elsevier B.V. All rights reserved.
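The ensemble idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the complexity score (a proxy built on the maximum Fisher discriminant ratio), the random sampling of candidate feature subsets, the majority vote, the toy data, and all function names are assumptions made for illustration only. The sketch draws candidate feature subsets, keeps those with the lowest complexity score, and lets one k-nearest neighbor classifier per kept subset vote.

```python
import random
from collections import Counter

def complexity(X, y, features):
    """Assumed complexity proxy for a two-class problem: 1 / (1 + max Fisher ratio).
    Lower values mean the classes are easier to separate on these features."""
    a, b = sorted(set(y))[:2]
    best = 0.0
    for f in features:
        va = [row[f] for row, lab in zip(X, y) if lab == a]
        vb = [row[f] for row, lab in zip(X, y) if lab == b]
        ma, mb = sum(va) / len(va), sum(vb) / len(vb)
        var_a = sum((v - ma) ** 2 for v in va) / len(va)
        var_b = sum((v - mb) ** 2 for v in vb) / len(vb)
        best = max(best, (ma - mb) ** 2 / (var_a + var_b + 1e-12))
    return 1.0 / (1.0 + best)

def select_low_complexity_subsets(X, y, n_candidates=200, subset_size=3,
                                  n_members=5, seed=0):
    """Sample random feature subsets and keep the n_members least complex ones."""
    rng = random.Random(seed)
    n_features = len(X[0])
    candidates = [rng.sample(range(n_features), subset_size)
                  for _ in range(n_candidates)]
    candidates.sort(key=lambda fs: complexity(X, y, fs))
    return candidates[:n_members]

def knn_predict(X_train, y_train, x, features, k=3):
    """Plain k-NN restricted to the given features (squared Euclidean distance)."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), lab)
        for row, lab in zip(X_train, y_train)
    )
    votes = Counter(lab for _, lab in dists[:k])
    return votes.most_common(1)[0][0]

def ensemble_predict(X_train, y_train, x, members, k=3):
    """Majority vote over one k-NN classifier per selected feature subset."""
    votes = Counter(knn_predict(X_train, y_train, x, fs, k) for fs in members)
    return votes.most_common(1)[0][0]

def make_toy_data(n_per_class=10, n_features=10, seed=0):
    """Synthetic stand-in for expression data: only feature 0 separates the classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += 10.0 * label  # informative feature
            X.append(row)
            y.append(label)
    return X, y

X, y = make_toy_data()
members = select_low_complexity_subsets(X, y)
query = [10.0] + [0.0] * 9  # resembles class 1 on the informative feature
prediction = ensemble_predict(X, y, query, members)
```

On real microarray data the number of features is in the thousands, so the number of candidate subsets, the subset size, and the complexity measure itself would all need to be chosen with care; the values above are placeholders.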
Pages: 151-162
Number of pages: 12
Related papers
50 in total
  • [21] Quantum K-nearest neighbors classification algorithm based on Mahalanobis distance
    Gao, Li-Zhen
    Lu, Chun-Yue
    Guo, Gong-De
    Zhang, Xin
    Lin, Song
    FRONTIERS IN PHYSICS, 2022, 10
  • [22] PERFORMANCE OF K-NEAREST NEIGHBORS ALGORITHM IN OPINION CLASSIFICATION
    Jedrzejewski, Krzysztof
    Zamorski, Maurycy
    FOUNDATIONS OF COMPUTING AND DECISION SCIENCES, 2013, 38 (02) : 97 - 110
  • [23] Diminishing Prototype Size for k-Nearest Neighbors Classification
    Samadpour, Mohammad Mehdi
    Parvin, Hamid
    Rad, Farhad
    2015 FOURTEENTH MEXICAN INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (MICAI), 2015, : 139 - 144
  • [24] k-Nearest Neighbors for automated classification of celestial objects
    Li LiLi
    Science China(Physics,Mechanics & Astronomy), 2008, (07) : 916 - 922
  • [25] k-Nearest Neighbors for automated classification of celestial objects
    Li LiLi
    Zhang YanXia
    Zhao YongHeng
    SCIENCE IN CHINA SERIES G-PHYSICS MECHANICS & ASTRONOMY, 2008, 51 (07): : 916 - 922
  • [26] k-nearest neighbors prediction and classification for spatial data
    Mohamed-Salem Ahmed
    Mamadou N’diaye
    Mohammed Kadi Attouch
    Sophie Dabo-Niange
    Journal of Spatial Econometrics, 2023, 4 (1):
  • [27] RSSI-based Localization Using K-Nearest Neighbors
    Achroufene, Achour
    AD HOC & SENSOR WIRELESS NETWORKS, 2023, 56 (1-2) : 105 - 135
  • [28] Forecasting Earnings Using k-Nearest Neighbors
    Easton, Peter D.
    Kapons, Martin M.
    Monahan, Steven J.
    Schutt, Harm H.
    Weisbrod, Eric H.
    ACCOUNTING REVIEW, 2024, 99 (03): : 115 - 140
  • [29] An adaptive k-nearest neighbors clustering algorithm for complex distribution dataset
    Zhang, Yan
    Jia, Yan
    Huang, Xiaobin
    Zhou, Bin
    Gu, Jian
    ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS, PROCEEDINGS: WITH ASPECTS OF ARTIFICIAL INTELLIGENCE, 2007, 4682 : 398 - 407
  • [30] Emotion recognition using speckle pattern analysis and k-nearest neighbors classification
    Lupa Yitzhak, Hadas
    Tzabari Kelman, Yarden
    Moskovenko, Alexey
    Zhovnerchuk, Evgenii
    Zalevsky, Zeev
    JOURNAL OF OPTICS, 2021, 23 (01)