Using random forest for reliable classification and cost-sensitive learning for medical diagnosis

Cited by: 59
Authors
Yang, Fan [1]
Wang, Hua-zhen [1]
Mi, Hong [1]
Lin, Cheng-de [1]
Cai, Wei-wen [2]
Affiliations
[1] Xiamen Univ, Automat Dept, Xiamen 361005, Peoples R China
[2] Baylor Coll Med, Dept Mol & Human Genet, Houston, TX 77030 USA
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
PREDICTION;
DOI
10.1186/1471-2105-10-S1-S22
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Most machine-learning classifiers output label predictions for new instances without indicating how reliable the predictions are. This limits their applicability in critical domains such as medical diagnosis, where incorrect predictions have serious consequences. Further, the default assumption of equal misclassification costs is most likely violated in medical diagnosis. Results: In this paper, we present a modified random forest classifier that is incorporated into the conformal predictor scheme. A conformal predictor is a transductive learning scheme that uses Kolmogorov complexity to test the randomness of a particular sample with respect to the training sets. Our method is well calibrated: the performance can be set prior to classification, and the accuracy rate is exactly equal to the predefined confidence level. Further, to address the cost-sensitive problem, we extend our method to a label-conditional predictor, which takes into account the different costs of misclassification in different classes and allows a different confidence level to be specified for each class. Intensive experiments on benchmark datasets and real-world applications show that the resultant classifier is well calibrated and able to control the specific risk of each class. Conclusion: Using the RF outlier measure to design a nonconformity measure benefits the resultant predictor. Further, the label-conditional classifier that we develop turns out to be an alternative approach to the cost-sensitive learning problem, one that relies on label-wise predefined confidence levels. The goal of minimizing the risk of misclassification is achieved by specifying a different confidence level for each class.
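The label-conditional idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes nonconformity scores are already available for each class (the paper derives them from a random forest outlier measure, which is not reproduced here), and the names `p_value`, `prediction_region`, and the toy numbers are hypothetical. A label is kept in the prediction region when its p-value, computed only against examples of that same class, exceeds that class's significance level (1 minus its confidence level).

```python
from typing import Dict, Hashable, List, Set


def p_value(class_scores: List[float], test_score: float) -> float:
    """Share of scores (test example included) at least as nonconforming as the test score."""
    at_least = sum(1 for s in class_scores if s >= test_score)
    return (at_least + 1) / (len(class_scores) + 1)


def prediction_region(
    scores_by_label: Dict[Hashable, List[float]],
    test_score_by_label: Dict[Hashable, float],
    significance_by_label: Dict[Hashable, float],
) -> Set[Hashable]:
    """Keep each label whose class-specific p-value exceeds that class's significance level."""
    region = set()
    for label, class_scores in scores_by_label.items():
        p = p_value(class_scores, test_score_by_label[label])
        if p > significance_by_label[label]:
            region.add(label)
    return region


# Toy, made-up nonconformity scores for two classes (assumption for illustration only).
scores_by_label = {
    "disease": [0.1, 0.2, 0.3, 0.9],
    "healthy": [0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5],
}
# Score of the new instance under each hypothesised label.
test_score_by_label = {"disease": 0.25, "healthy": 0.6}
# A stricter level for the costly class: 95% confidence for "disease", 80% for "healthy".
significance_by_label = {"disease": 0.05, "healthy": 0.20}

region = prediction_region(scores_by_label, test_score_by_label, significance_by_label)
print(region)
```

Setting a smaller significance level for a class makes the predictor more reluctant to rule that class out, which is how per-class confidence levels stand in for explicit misclassification costs in this sketch.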
Pages: 14
Related papers
50 in total
  • [1] Using random forest for reliable classification and cost-sensitive learning for medical diagnosis
    Fan Yang
    Hua-zhen Wang
    Hong Mi
    Cheng-de Lin
    Wei-wen Cai
    [J]. BMC Bioinformatics, 10
  • [2] Cost-sensitive fuzzy classification for medical diagnosis
    Schaefer, G.
    Nakashima, T.
    Yokota, Y.
    Ishibuchi, H.
    [J]. 2007 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2007, : 312 - +
  • [3] Cost-sensitive classification based on Bregman divergences for medical diagnosis
    Santos-Rodriguez, Raul
    Garcia-Garcia, Dario
    Cid-Sueiro, Jesus
    [J]. EIGHTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2009, : 551 - 556
  • [4] Active Learning for Cost-Sensitive Classification
    Krishnamurthy, Akshay
    Agarwal, Alekh
    Huang, Tzu-Kuo
    Daume, Hal, III
    Langford, John
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [5] Active Learning for Cost-Sensitive Classification
    Krishnamurthy, Akshay
    Agarwal, Alekh
    Huang, Tzu-Kuo
    Daume, Hal, III
    Langford, John
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2019, 20
  • [6] Active learning for cost-sensitive classification
    Krishnamurthy, Akshay
    Agarwal, Alekh
    Huang, Tzu-Kuo
    Daumé III, Hal
    Langford, John
    [J]. Journal of Machine Learning Research, 2019, 20
  • [7] Diagnosis of Type 2 Diabetes Using Cost-Sensitive Learning
    Zahirnia, Kiarash
    Teimouri, Mehdi
    Rahmani, Rohallah
    Salaq, Amin
    [J]. 2015 5TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2015, : 158 - 163
  • [8] Cost-Sensitive Broad Learning System for Imbalanced Classification and Its Medical Application
    Yao, Liang
    Wong, Pak Kin
    Zhao, Baoliang
    Wang, Ziwen
    Lei, Long
    Wang, Xiaozheng
    Hu, Ying
    [J]. MATHEMATICS, 2022, 10 (05)
  • [9] A hybrid cost-sensitive machine learning approach for the classification of intelligent disease diagnosis
    Chen, Xi
    Jin, Wenquan
    Wu, Qirui
    Zhang, Wenbo
    Liang, Haiming
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 43 (03) : 3039 - 3050
  • [10] Improving Imbalanced Dialogue Act Classification Using Cost-Sensitive Learning
    Miyagi, Takaaki
    Endo, Satoshi
    [J]. 2022 JOINT 12TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS AND 23RD INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (SCIS&ISIS), 2022,