Conformalized Semi-supervised Random Forest for Classification and Abnormality Detection

Cited by: 0
Authors
Han, Yujin [1 ,4 ]
Xu, Mingwenchan [2 ,4 ]
Guan, Leying [3 ]
Affiliations
[1] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[2] Northwestern Univ, Dept IEMS, Evanston, IL USA
[3] Yale Univ, Dept Biostat, New Haven, CT 06520 USA
[4] Yale Univ, New Haven, CT USA
Keywords
PREDICTIVE INFERENCE; COVARIATE SHIFT;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Random Forests classifier, a widely used off-the-shelf classification tool, assumes, like other standard classifiers, that training and test samples come from the same distribution. However, in safety-critical scenarios such as medical diagnosis and network attack detection, discrepancies between the training and test sets, including the potential presence of novel outlier samples that do not appear during training, can pose significant challenges. To address this problem, we introduce the Conformalized Semi-Supervised Random Forest (CSForest), which couples the conformalization technique Jackknife+aB with semi-supervised tree ensembles to construct a set-valued prediction C(x). Instead of optimizing over the training distribution, CSForest employs unlabeled test samples to enhance accuracy and flag unseen outliers by generating an empty set. Theoretically, we establish that CSForest covers true labels for previously observed inlier classes under arbitrary label shift in the test data. We compare CSForest with state-of-the-art methods using synthetic examples and various real-world datasets, under different types of distribution changes in the test domain. Our results highlight CSForest's effective prediction of inliers and its ability to detect outlier samples unique to the test data. In addition, CSForest shows consistently good performance as the sizes of the training and test sets vary. Code for CSForest is available at https://github.com/yujinhan98/CSForest
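To make the idea of a set-valued prediction C(x) with an empty set as the outlier flag concrete, here is a minimal sketch. It uses plain class-conditional split conformal prediction, not the Jackknife+aB construction or the semi-supervised forest described in the abstract. The function names, the score values, and the convention that a higher score means a better fit to a class are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def conformal_p_value(cal_scores: List[float], test_score: float) -> float:
    """Conformal p-value: share of calibration scores at most the test score.

    Assumption: a higher score means the sample conforms better to the class,
    so a low test score yields a small p-value.
    """
    n_le = sum(1 for s in cal_scores if s <= test_score)
    return (n_le + 1) / (len(cal_scores) + 1)


def prediction_set(
    cal_scores_by_class: Dict[str, List[float]],
    test_scores_by_class: Dict[str, float],
    alpha: float,
) -> Set[str]:
    """Return C(x): every class whose conformal p-value exceeds alpha.

    An empty set means no observed class fits, which flags x as an outlier.
    """
    return {
        label
        for label, score in test_scores_by_class.items()
        if conformal_p_value(cal_scores_by_class[label], score) > alpha
    }


# Hypothetical calibration scores (e.g. held-out class probabilities).
calibration = {
    "A": [0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95],
    "B": [0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95],
}

# A test point that looks like class A, and one that fits neither class.
inlier_set = prediction_set(calibration, {"A": 0.82, "B": 0.18}, alpha=0.2)
outlier_set = prediction_set(calibration, {"A": 0.30, "B": 0.35}, alpha=0.2)

print(inlier_set)   # set containing only "A"
print(outlier_set)  # empty set, i.e. flagged as an outlier
```

In this sketch each class is tested separately, so a point can receive one label, several labels, or none. The per-class calibration is only a simple stand-in for the inlier-class coverage property the abstract states for CSForest.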
Pages: 22
Related papers
50 in total
  • [1] Conformalized Semi-supervised Random Forest for Classification and Abnormality Detection
    Han, Yujin
    Xu, Mingwenchan
    Guan, Leying
    Proceedings of Machine Learning Research, 2024, 238 : 2881 - 2889
  • [2] Active Semi-Supervised Random Forest for Hyperspectral Image Classification
    Zhang, Youqiang
    Cao, Guo
    Li, Xuesong
    Wang, Bisheng
    Fu, Peng
    REMOTE SENSING, 2019, 11 (24)
  • [3] SEMI-SUPERVISED CLASSIFICATION OF HYPERSPECTRAL IMAGE USING RANDOM FOREST ALGORITHM
    Amini, S.
    Homayouni, S.
    Safari, A.
    2014 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2014,
  • [4] RANDOM FOREST IN SEMI-SUPERVISED LEARNING (CO-FOREST)
    Settouti, Nesma
    Daho, Mostafa El Habib
    Lazouni, Mohammed El Amine
    Chikh, Mohammed Amine
    2013 8TH INTERNATIONAL WORKSHOP ON SYSTEMS, SIGNAL PROCESSING AND THEIR APPLICATIONS (WOSSPA), 2013, : 326 - 329
  • [5] HSRF: Community Detection Based on Heterogeneous Attributes and Semi-Supervised Random Forest
    Fan, Zijing
    Yuan, Chao
    Xin, Liling
    Wang, Xuren
    Jiang, Zhengwei
    Wang, Qiuyun
    PROCEEDINGS OF THE 2021 IEEE 24TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN (CSCWD), 2021, : 1141 - 1147
  • [6] Semi-supervised Node Splitting for Random Forest Construction
    Liu, Xiao
    Song, Mingli
    Tao, Dacheng
    Liu, Zicheng
    Zhang, Luming
    Chen, Chun
    Bu, Jiajun
    2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, : 492 - 499
  • [7] Semi-Supervised Isolation Forest for Anomaly Detection
    Stradiotti, Luca
    Perini, Lorenzo
    Davis, Jesse
    PROCEEDINGS OF THE 2024 SIAM INTERNATIONAL CONFERENCE ON DATA MINING, SDM, 2024, : 670 - 678
  • [8] Image classification: A random semi-supervised sampling approach
    Han, Dongfeng
    Zhu, Zhiliang
    Li, Wenhui
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2009, 21 (09): 1333 - 1338
  • [9] Semi-supervised Learning Approach to Abnormality Detection with Complementary Features
    Lu, Shaowen
    Wen, Yixin
    2020 IEEE 18TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN), VOL 1, 2020, : 110 - 114
  • [10] A Semi-supervised Generalized VAE Framework for Abnormality Detection using One-Class Classification
    Sharma, Renuka
    Mashkaria, Satvik
    Awate, Suyash P.
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 1302 - 1310