Estimating classification probabilities in high-dimensional diagnostic studies

Cited by: 6
Authors
Appel, Inka J. [1]
Gronwald, Wolfram [1]
Spang, Rainer [1]
Affiliation
[1] Univ Regensburg, Inst Funct Genom, D-93053 Regensburg, Germany
Keywords
GENE; CANCER; DISEASE;
DOI
10.1093/bioinformatics/btr434
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Classification algorithms for high-dimensional biological data such as gene expression profiles or metabolomic fingerprints are typically evaluated by the number of misclassifications across a test dataset. However, to judge the classification of a single case in the context of clinical diagnosis, we need to assess the uncertainties associated with that individual case rather than the average accuracy across many cases. The reliability of individual classifications can be expressed in terms of class probabilities. While classification algorithms are a well-developed area of research, the estimation of class probabilities is considerably less developed in biology, and only a few classification algorithms provide estimated class probabilities.
Results: We compared several probability estimators in the context of classification of metabolomics profiles. Evaluation criteria included sparseness biases, calibration of the estimator, the variance of the estimator and its performance in identifying highly reliable classifications. We observed that several of the estimators display artifacts that compromise their use in practice. Classification probabilities based on a combination of local cross-validation error rates and monotone regression prove superior in metabolomic profiling.
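The abstract does not spell out how the monotone regression step is implemented, so the following is only a generic illustration of the idea, not the authors' procedure. It assumes that held-out classifier scores and true 0/1 labels are already available (for example from cross-validation) and fits a non-decreasing map from score to class probability with the pool-adjacent-violators algorithm. The function names `pava` and `calibrate` and the toy data are hypothetical.

```python
def pava(values, weights=None):
    """Pool-adjacent-violators: least-squares non-decreasing fit to values."""
    if weights is None:
        weights = [1.0] * len(values)
    # Each block holds [weighted mean, total weight, number of pooled points].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool neighbouring blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand the pooled blocks back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def calibrate(scores, labels):
    """Sort cases by score and return (sorted scores, monotone probabilities).

    Assumes labels are 0/1 and scores come from held-out predictions.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    sorted_scores = [scores[i] for i in order]
    sorted_labels = [labels[i] for i in order]
    return sorted_scores, pava(sorted_labels)


# Toy data (made up for illustration only).
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]
labels = [0, 1, 0, 1, 0, 1]
sorted_scores, probs = calibrate(scores, labels)
print(list(zip(sorted_scores, probs)))
```

On the toy data, the two middle cases (scores 0.4 and 0.7) are pooled into a single level with estimated probability 0.5, which shows how the monotone fit smooths out local reversals in the observed labels.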
Pages: 2563-2570
Number of pages: 8