Bayesian weighted random forest for classification of high-dimensional genomics data

Cited by: 7
Authors
Olaniran, Oyebayo Ridwan [1]
Abdullah, Mohd Asrul A. [2]
Affiliations
[1] Univ Ilorin, Dept Stat, Ilorin, Nigeria
[2] UTHM, Dept Math & Stat, FAST, Parit Raja, Johor, Malaysia
Keywords
Bayesian; High-dimensional; Genomic data; Classification; Random forest; VARIABLE SELECTION; BREAST-CANCER; GENE; PREDICTION; TUMOR; PATTERNS; LEUKEMIA
DOI
10.1016/j.kjs.2023.06.008
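Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract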
In this paper, a fully Bayesian weighted probabilistic model is developed for random classification trees. The new model, the Bayesian Weighted Random Classification Forest (BWRCF), modifies the existing random classification forest in two ways. First, the tree terminal node estimation procedure is replaced with a Bayesian estimation approach. Second, a new variable ranking procedure is developed and hybridized with BWRCF to tackle high dimensionality. The performance of the proposed method is evaluated on simulated and real-life high-dimensional microarray datasets using holdout accuracy and misclassification error rates. The results show that the proposed BWRCF is robust to moderate and large degrees of high dimensionality. In addition, BWRCF shows improved predictive performance and efficiency over selected competing methods.
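The abstract does not state which prior or weighting scheme BWRCF uses, so the following is only a rough illustration of the general idea of Bayesian terminal-node estimation, not the authors' method. It assumes a symmetric Dirichlet prior over class probabilities in a single terminal node; the function names, the `alpha` parameter, and the toy labels are all hypothetical. It also shows how a holdout misclassification error rate, one of the two evaluation measures named above, is computed.

```python
from collections import Counter


def leaf_posterior_mean(labels, classes, alpha=1.0):
    """Posterior mean class probabilities for one terminal node.

    Assumes a symmetric Dirichlet(alpha) prior; this prior is an
    illustrative assumption, not taken from the paper.
    """
    counts = Counter(labels)
    total = len(labels) + alpha * len(classes)
    return {c: (counts.get(c, 0) + alpha) / total for c in classes}


def misclassification_rate(y_true, y_pred):
    """Fraction of holdout samples whose predicted label is wrong."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


# Toy terminal node holding three training samples (hypothetical labels).
probs = leaf_posterior_mean(["tumor", "tumor", "normal"], ["tumor", "normal"])

# Toy holdout evaluation: two of four predictions are wrong.
error = misclassification_rate([1, 0, 1, 1], [1, 1, 1, 0])
accuracy = 1.0 - error
```

With a raw frequency estimate, a node that happens to contain no samples of some class assigns that class probability zero; the prior pulls such estimates away from zero, which matters when nodes are small, as is typical when samples are few and variables are many.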
Pages: 477-484
Number of pages: 8