Ensemble of sparse classifiers for high-dimensional biological data

Cited by: 7
Authors
Kim, Sunghan [1 ,2 ]
Scalzo, Fabien [2 ]
Telesca, Donatello [3 ]
Hu, Xiao [2 ]
Affiliations
[1] E Carolina Univ, Coll Technol & Comp Sci, Dept Engn, Greenville, NC 27858 USA
[2] Univ Calif Los Angeles, David Geffen Sch Med, Dept Neurosurg, Neural Syst & Dynam Lab, Los Angeles, CA 90095 USA
[3] Univ Calif Los Angeles, Sch Publ Hlth, Dept Biostat, Los Angeles, CA 90095 USA
Keywords
ensemble sparse classifier; l(0)-norm solution; feature selection; mass spectrometry; sparse solvers; OVARIAN-CANCER IDENTIFICATION; PROTEOMIC PATTERNS; SELECTION; SERUM; RECONSTRUCTION
DOI
10.1504/IJDMB.2015.069416
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Biological data are often high in dimension while the number of samples is small. In such cases, classification performance can be improved by reducing the dimension of the data, which is referred to as feature selection. A recently proposed feature selection method exploits the sparsity of high-dimensional biological data, in which a small subset of features accounts for most of the variance in the dataset. In this study we propose a new classification method for high-dimensional biological data that performs both feature selection and classification within a single framework. Our proposed method utilises a sparse linear solution technique and the bootstrap aggregating algorithm. We tested its performance on four public mass spectrometry cancer datasets and compared it with two conventional classification techniques, Support Vector Machines and Adaptive Boosting. The results demonstrate that our proposed method classifies more accurately across the cancer datasets than these conventional techniques.
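The combination described in the abstract, a sparse linear solution fitted on bootstrap resamples and aggregated by voting, can be illustrated with a minimal sketch. This is not the authors' implementation: greedy orthogonal matching pursuit is used here only as a stand-in for a sparse solver, the data are synthetic, and the names `omp`, `fit_ensemble`, `predict`, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def omp(X, y, k):
    """Greedy orthogonal matching pursuit (stand-in sparse solver).

    Returns a weight vector with at most k nonzero entries that
    approximately solves the least-squares problem X @ w ~ y.
    """
    y = y.astype(float)
    w = np.zeros(X.shape[1])
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        corr = np.abs(X.T @ residual)
        if support:
            corr[support] = -np.inf  # never pick the same feature twice
        support.append(int(np.argmax(corr)))
        # Refit least squares on the features selected so far.
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ coef
    w[support] = coef
    return w


def fit_ensemble(X, y, n_models=25, k=5, seed=0):
    """Bootstrap aggregating: one sparse weight vector per resample."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    weights = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        weights.append(omp(X[idx], y[idx], k))
    return np.array(weights)


def predict(weights, X):
    """Majority vote over the signs of the linear scores."""
    votes = np.sign(X @ weights.T)  # shape: (n_samples, n_models)
    return np.where(votes.sum(axis=1) >= 0, 1, -1)


# Synthetic data (assumption): 60 samples, 500 features, 3 informative.
rng = np.random.default_rng(0)
n, d = 60, 500
y = np.where(rng.random(n) < 0.5, 1, -1)
X = rng.normal(size=(n, d))
X[:, :3] += 1.5 * y[:, None]

W = fit_ensemble(X[:40], y[:40], n_models=25, k=5, seed=1)
pred = predict(W, X[40:])
selected = np.flatnonzero(np.any(W != 0, axis=0))  # features used by any model
print("held-out accuracy:", np.mean(pred == y[40:]))
print("distinct features used by the ensemble:", selected.size)
```

Each base model keeps at most `k` nonzero weights, so feature selection and classification happen in the same fit; the union of nonzero positions across the ensemble gives the selected feature set.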
Pages: 167-183
Number of pages: 17