Ensemble of sparse classifiers for high-dimensional biological data

Cited by: 7
Authors
Kim, Sunghan [1, 2]
Scalzo, Fabien [2]
Telesca, Donatello [3]
Hu, Xiao [2]
Affiliations
[1] E Carolina Univ, Coll Technol & Comp Sci, Dept Engn, Greenville, NC 27858 USA
[2] Univ Calif Los Angeles, David Geffen Sch Med, Dept Neurosurg, Neural Syst & Dynam Lab, Los Angeles, CA 90095 USA
[3] Univ Calif Los Angeles, Sch Publ Hlth, Dept Biostat, Los Angeles, CA 90095 USA
Keywords
ensemble sparse classifier; l(0)-norm solution; feature selection; mass spectrometry; sparse solvers; OVARIAN-CANCER IDENTIFICATION; PROTEOMIC PATTERNS; SELECTION; SERUM; RECONSTRUCTION
DOI
10.1504/IJDMB.2015.069416
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Biological data are often high in dimension while the number of samples is small. In such cases, classification performance can be improved by reducing the dimension of the data, which is referred to as feature selection. Recently, a novel feature selection method has been proposed that utilises the sparsity of high-dimensional biological data, where a small subset of features accounts for most of the variance of the dataset. In this study we propose a new classification method for high-dimensional biological data, which performs both feature selection and classification within a single framework. Our proposed method utilises a sparse linear solution technique and the bootstrap aggregating algorithm. We tested its performance on four public mass spectrometry cancer datasets, alongside two conventional classification techniques, Support Vector Machines and Adaptive Boosting. The results demonstrate that our proposed method classifies more accurately across various cancer datasets than these conventional techniques.
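The combination described in the abstract, a sparse linear solver wrapped in bootstrap aggregating, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an l1-penalised least-squares fit (lasso, solved by coordinate descent) as a stand-in for the paper's sparse solution technique, labels coded as +1/-1, and a synthetic dataset. All function names, the penalty value, and the toy data generator are hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal step of an l1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_sparse_weights(X, y, penalty=0.2, sweeps=50):
    """Lasso via cyclic coordinate descent (assumed stand-in sparse solver).

    Minimises (1/2n) * ||y - Xw||^2 + penalty * ||w||_1, with no intercept.
    """
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    residual = list(y)  # equals y - Xw while all weights are zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * weights[j])
                      for i in range(n)) / n
            updated = soft_threshold(rho, penalty) / col_sq[j]
            delta = updated - weights[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                weights[j] = updated
    return weights


def fit_bagged_sparse(X, y, n_models=15, penalty=0.2, seed=0):
    """Fit one sparse model per bootstrap replicate (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_sparse_weights([X[i] for i in idx],
                                         [y[i] for i in idx], penalty))
    return models


def predict(models, x):
    """Majority vote over the sign of each model's linear score."""
    votes = 0
    for w in models:
        score = sum(w_j * x_j for w_j, x_j in zip(w, x))
        votes += 1 if score >= 0 else -1
    return 1 if votes >= 0 else -1


def make_toy_data(n=40, p=30, seed=1):
    """Hypothetical data: many features, few samples, one informative feature."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        row[0] = 2.0 * label + rng.gauss(0.0, 0.5)  # only feature 0 carries signal
        X.append(row)
        y.append(label)
    return X, y
```

Calling `fit_bagged_sparse(*make_toy_data())` returns a list of weight vectors, and `predict(models, x)` classifies a new sample by vote. Because the l1 penalty zeroes out most coefficients, each ensemble member selects its own small feature subset, which is how feature selection and classification can share one framework. For real data, a library solver such as scikit-learn's `Lasso` would likely be a more practical choice than this pure-Python loop.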
Pages: 167 - 183
Number of pages: 17
Related papers
50 in total
  • [31] Ensemble Clustering for Boundary Detection in High-Dimensional Data
    Anagnostou, Panagiotis
    Pavlidis, Nicos G.
    Tasoulis, Sotiris
    MACHINE LEARNING, OPTIMIZATION, AND DATA SCIENCE, LOD 2023, PT II, 2024, 14506 : 324 - 333
  • [32] Poster: Adversarial Examples for Classifiers in High-Dimensional Network Data
    Ahmed, Muhammad Ejaz
    Kim, Hyoungshick
    CCS'17: PROCEEDINGS OF THE 2017 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2017, : 2467 - 2469
  • [33] Efficient Sparse Representation for Learning With High-Dimensional Data
    Chen, Jie
    Yang, Shengxiang
    Wang, Zhu
    Mao, Hua
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (08) : 4208 - 4222
  • [34] Ensemble Linear Subspace Analysis of High-Dimensional Data
    Ahmed, S. Ejaz
    Amiri, Saeid
    Doksum, Kjell
    ENTROPY, 2021, 23 (03)
  • [35] Subspace Clustering of Very Sparse High-Dimensional Data
    Peng, Hankui
    Pavlidis, Nicos
    Eckley, Idris
    Tsalamanis, Ioannis
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 3780 - 3783
  • [36] To aggregate or not to aggregate high-dimensional classifiers
    Xu, Cheng-Jian
    Hoefsloot, Huub C. J.
    Smilde, Age K.
    BMC BIOINFORMATICS, 2011, 12
  • [37] To aggregate or not to aggregate high-dimensional classifiers
    Cheng-Jian Xu
    Huub CJ Hoefsloot
    Age K Smilde
    BMC Bioinformatics, 12
  • [38] Categorical Data Analysis for High-Dimensional Sparse Gene Expression Data
    Dousti Mousavi, Niloufar
    Aldirawi, Hani
    Yang, Jie
    BIOTECH, 2023, 12 (03):
  • [39] High-dimensional sparse MANOVA
    Cai, T. Tony
    Xia, Yin
    JOURNAL OF MULTIVARIATE ANALYSIS, 2014, 131 : 174 - 196
  • [40] On fuzzy feature selection in designing fuzzy classifiers for high-dimensional data
    Mansoori E.G.
    Shafiee K.S.
    Evol. Syst., 4 : 255 - 265