Bayesian weighted random forest for classification of high-dimensional genomics data

Cited by: 7
Authors
Olaniran, Oyebayo Ridwan [1]
Abdullah, Mohd Asrul A. [2]
Affiliations
[1] Univ Ilorin, Dept Stat, Ilorin, Nigeria
[2] UTHM, Dept Math & Stat, FAST, Parit Raja, Johor, Malaysia
Keywords
Bayesian; High-dimensional; Genomic data; Classification; Random forest; VARIABLE SELECTION; BREAST-CANCER; GENE; PREDICTION; TUMOR; PATTERNS; LEUKEMIA
DOI
10.1016/j.kjs.2023.06.008
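Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]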
Discipline classification code
07; 0710; 09
Abstract
This paper develops a fully Bayesian weighted probabilistic model for random classification trees. The new model, the Bayesian Weighted Random Classification Forest (BWRCF), modifies the existing random classification forest in two ways. First, the estimation procedure at the tree terminal nodes is replaced with a Bayesian estimation approach. Second, a new variable ranking procedure is developed and hybridized with BWRCF to tackle high dimensionality. The performance of the proposed method is evaluated on simulated and real-life high-dimensional microarray datasets using holdout accuracy and misclassification error rates. The results show that BWRCF is robust, withstanding moderate to large high-dimensionality scenarios. In addition, BWRCF has better predictive performance and efficiency than the selected competing methods.
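The abstract does not state the prior, the weighting scheme, or the ranking procedure, so the following is only a minimal sketch of the general idea under stated assumptions: a symmetric Dirichlet prior on the class proportions in each terminal node, hypothetical per-tree weights supplied by the caller, and the two holdout metrics named above. The function names (`posterior_class_probs`, `weighted_vote`, `holdout_metrics`) and the `alpha` parameter are illustrative and are not taken from the paper.

```python
from collections import Counter


def posterior_class_probs(labels, classes, alpha=1.0):
    """Posterior mean of the class probabilities in one terminal node.

    Assumes a symmetric Dirichlet(alpha) prior (an assumption, not the
    paper's stated prior): p(c) = (n_c + alpha) / (n + K * alpha).
    """
    counts = Counter(labels)
    total = len(labels) + alpha * len(classes)
    return {c: (counts[c] + alpha) / total for c in classes}


def weighted_vote(tree_probs, weights):
    """Combine per-tree class probabilities using normalised tree weights.

    The weights are hypothetical inputs; how BWRCF derives them is not
    described in the abstract.
    """
    norm = sum(weights)
    classes = list(tree_probs[0])
    combined = {
        c: sum(w * p[c] for w, p in zip(weights, tree_probs)) / norm
        for c in classes
    }
    return max(combined, key=combined.get), combined


def holdout_metrics(y_true, y_pred):
    """Return (holdout accuracy, misclassification error rate)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    return accuracy, 1.0 - accuracy


if __name__ == "__main__":
    classes = ("tumor", "normal")
    node_a = posterior_class_probs(["tumor", "tumor", "normal"], classes)
    node_b = posterior_class_probs(["normal", "normal", "normal"], classes)
    label, probs = weighted_vote([node_a, node_b], weights=[1.0, 3.0])
    print(node_a, node_b, label, probs)
    print(holdout_metrics(["a", "b", "a", "b"], ["a", "b", "b", "b"]))
```

With a small node, the prior pulls the estimate away from the raw class frequencies, so a node containing only one class never assigns probability zero to the others; this is the usual motivation for a Bayesian estimate when terminal nodes hold few samples, as is typical for microarray data.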
Pages: 477-484
Page count: 8
Related papers
50 in total
  • [1] Laplacian-Weighted Random Forest for High-Dimensional Data Classification
    Liang, Jianheng
    Huang, Dong
    2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019 : 748 - 753
  • [2] BayesRandomForest: An R implementation of Bayesian Random Forest for Regression Analysis of High-dimensional Data
    Olaniran, Oyebayo Ridwan
    Bin Abdullah, Mohd Asrul Affendi
    ROMANIAN STATISTICAL REVIEW, 2018, (01) : 95 - 102
  • [3] High-Dimensional Data in Genomics
    Amaratunga, Dhammika
    Cabrera, Javier
    BIOPHARMACEUTICAL APPLIED STATISTICS SYMPOSIUM, VOL 3: PHARMACEUTICAL APPLICATIONS, 2018 : 65 - 73
  • [4] iBAG: integrative Bayesian analysis of high-dimensional multiplatform genomics data
    Wang, Wenting
    Baladandayuthapani, Veerabhadran
    Morris, Jeffrey S.
    Broom, Bradley M.
    Manyam, Ganiraju
    Do, Kim-Anh
    BIOINFORMATICS, 2013, 29 (02) : 149 - 159
  • [5] Random Forest Modelling of High-Dimensional Mixed-Type Data for Breast Cancer Classification
    Quist, Jelmar
    Taylor, Lawson
    Staaf, Johan
    Grigoriadis, Anita
    CANCERS, 2021, 13 (05) : 1 - 15
  • [6] The Application of high-dimensional Data Classification by Random Forest based on Hadoop Cloud Computing Platform
    Li, Chong
    3RD INTERNATIONAL CONFERENCE ON APPLIED ENGINEERING, 2016, 51 : 385 - 390
  • [7] Bayesian Random Forest with Multiple Imputation by Chain Equations for High-Dimensional Missing Data: A Simulation Study
    Olaniran, Oyebayo Ridwan
    Alzahrani, Ali Rashash R.
    MATHEMATICS, 2025, 13 (06)
  • [8] On the Oracle Properties of Bayesian Random Forest for Sparse High-Dimensional Gaussian Regression
    Olaniran, Oyebayo Ridwan
    Alzahrani, Ali Rashash R.
    MATHEMATICS, 2023, 11 (24)
  • [9] Bayesian shrinkage models for integration and analysis of multiplatform high-dimensional genomics data
    Xue, Hao
    Chakraborty, Sounak
    Dey, Tanujit
    STATISTICAL ANALYSIS AND DATA MINING, 2024, 17 (02)
  • [10] Classification by ensembles from random partitions of high-dimensional data
    Ahn, Hongshik
    Moon, Hojin
    Fazzari, Melissa J.
    Lim, Noha
    Chen, James J.
    Kodell, Ralph L.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2007, 51 (12) : 6166 - 6179