Bayesian weighted random forest for classification of high-dimensional genomics data

Cited by: 7
Authors
Olaniran, Oyebayo Ridwan [1]
Abdullah, Mohd Asrul A. [2]
Affiliations
[1] Univ Ilorin, Dept Stat, Ilorin, Nigeria
[2] UTHM, Dept Math & Stat, FAST, Parit Raja, Johor, Malaysia
Keywords
Bayesian; High-dimensional; Genomic data; Classification; Random forest; VARIABLE SELECTION; BREAST-CANCER; GENE; PREDICTION; TUMOR; PATTERNS; LEUKEMIA
DOI
10.1016/j.kjs.2023.06.008
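Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes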
07 ; 0710 ; 09 ;
Abstract
In this paper, a fully Bayesian weighted probabilistic model is developed for random classification trees. The new model, the Bayesian Weighted Random Classification Forest (BWRCF), modifies the existing random classification forest in two ways. First, the tree terminal-node estimation procedure is replaced with a Bayesian estimation approach. Second, a new variable ranking procedure is developed and hybridized with BWRCF to tackle high dimensionality. The performance of the proposed method is analyzed on simulated and real-life high-dimensional microarray datasets using holdout accuracy and misclassification error rates. The results show that BWRCF is robust to moderate-to-large high-dimensionality scenarios. In addition, BWRCF shows improved predictive performance and efficiency over selected competing methods.
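The abstract does not give the estimator, the weighting scheme, or the ranking procedure, so the following is only a minimal illustrative sketch and not the authors' BWRCF. It assumes a symmetric Dirichlet prior for the class probabilities at a terminal node, combines trees with hypothetical per-tree weights, and computes the two evaluation measures the abstract names (holdout accuracy and misclassification error rate). All function names and the `alpha` parameter are assumptions introduced for illustration.

```python
from collections import Counter


def leaf_posterior_mean(labels, classes, alpha=1.0):
    """Posterior mean class probabilities at one terminal node.

    Assumes a symmetric Dirichlet(alpha) prior (an assumption, not taken
    from the paper): p(c) = (n_c + alpha) / (n + alpha * K).
    """
    counts = Counter(labels)
    total = len(labels) + alpha * len(classes)
    return {c: (counts.get(c, 0) + alpha) / total for c in classes}


def forest_predict(leaf_posteriors, weights):
    """Weighted average of per-tree leaf posteriors.

    `weights` are hypothetical per-tree weights; the paper's actual
    weighting scheme is not described in the abstract.
    Returns the most probable class and the averaged probabilities.
    """
    weight_sum = sum(weights)
    classes = list(leaf_posteriors[0].keys())
    averaged = {
        c: sum(w * p[c] for p, w in zip(leaf_posteriors, weights)) / weight_sum
        for c in classes
    }
    return max(averaged, key=averaged.get), averaged


def holdout_metrics(y_true, y_pred):
    """Holdout accuracy and misclassification error rate (1 - accuracy)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    return accuracy, 1.0 - accuracy
```

With a prior of this kind, a terminal node that contains only a few samples never assigns probability zero to an unseen class, which is one common motivation for Bayesian leaf estimates when samples are scarce relative to the number of genes.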
Pages: 477-484
Number of pages: 8
Related papers
50 in total
  • [11] Weighted random subspace method for high dimensional data classification
    Li, Xiaoye
    Zhao, Hongyu
    STATISTICS AND ITS INTERFACE, 2009, 2 (02) : 153 - 159
  • [12] Multiple Bayesian discriminant functions for high-dimensional massive data classification
    Zhang, Jianfei
    Wang, Shengrui
    Chen, Lifei
    Gallinari, Patrick
    DATA MINING AND KNOWLEDGE DISCOVERY, 2017, 31 (02) : 465 - 501
  • [13] Multiple Bayesian discriminant functions for high-dimensional massive data classification
    Jianfei Zhang
    Shengrui Wang
    Lifei Chen
    Patrick Gallinari
    Data Mining and Knowledge Discovery, 2017, 31 : 465 - 501
  • [14] Hybrid Dimensionality Reduction Forest With Pruning for High-Dimensional Data Classification
    Chen, Weihong
    Xu, Yuhong
    Yu, Zhiwen
    Cao, Wenming
    Chen, C. L. Philip
    Han, Guoqiang
    IEEE ACCESS, 2020, 8 : 40138 - 40150
  • [15] Research of Medical High-dimensional Imbalanced Data Classification-Ensemble Feature Selection Algorithm with Random Forest
    Zhu, Min
    Su, Bo
    Ning, Gangmin
    2017 INTERNATIONAL CONFERENCE ON SMART GRID AND ELECTRICAL AUTOMATION (ICSGEA), 2017, : 273 - 277
  • [16] Bayesian clinical classification from high-dimensional data: Signatures versus variability
    Shalabi, Akram
    Inoue, Masato
    Watkins, Johnathan
    De Rinaldis, Emanuele
    Coolen, Anthony C. C.
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2018, 27 (02) : 336 - 351
  • [17] The Visualization of E-commerce High-dimensional Data Based on Random Forest
    Zhu Xianwen
    Yin Hongtan
    AGRO FOOD INDUSTRY HI-TECH, 2017, 28 (01) : 987 - 991
  • [18] The visualization of e-commerce high-dimensional data based on random forest
    Xianwen, Zhu, 1600, TeknoScienze, Viale Brianza,22, Milano, 20127, Italy (28):
  • [19] A classification algorithm for high-dimensional data
    Roy, Asim
    INNS CONFERENCE ON BIG DATA 2015 PROGRAM, 2015, 53 : 345 - 355
  • [20] GA-optimized random forest classification for high dimensional data
    Pan, Jingchang
    Wei, Peng
    Guo, Qiang
    Zhang, Caiming
    Luo, Ali
    ICIC Express Letters, 2011, 5 (05) : 1529 - 1534