Rank-Based Greedy Model Averaging for High-Dimensional Survival Data

Authors
He, Baihua [1]
Ma, Shuangge [2]
Zhang, Xinyu [1,3]
Zhu, Li-Xing [4,5]
Affiliations
[1] Univ Sci & Technol China, Sch Management, Int Inst Finance, Hefei, Peoples R China
[2] Yale Univ, Dept Biostat, New Haven, CT USA
[3] Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
[4] Beijing Normal Univ Zhuhai, Ctr Stat & Data Sci, Zhuhai, Peoples R China
[5] Hong Kong Baptist Univ, Dept Math, Kowloon Tong, Hong Kong, Peoples R China
Funding
National Institutes of Health (USA); National Natural Science Foundation of China
Keywords
Greedy algorithm; High-dimensional survival data; Model averaging; Prediction; Smooth concordance index; CONFIDENCE-INTERVALS; SELECTION; CONSISTENCY;
DOI
10.1080/01621459.2022.2070070
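Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714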
Abstract
Model averaging is an effective way to enhance prediction accuracy. However, most previous works focus on low-dimensional settings with completely observed responses. To attain an accurate prediction for the risk effect of survival data with high-dimensional predictors, we propose a novel method: rank-based greedy (RG) model averaging. Specifically, adopting the transformation model with splitting predictors as working models, we doubly use the smooth concordance index function to derive the candidate predictions and optimal model weights. The final prediction is achieved by weighted averaging all the candidates. Our approach is flexible, computationally efficient, and robust against model misspecification, as it neither requires the correctness of a joint model nor involves the estimation of the transformation function. We further adopt the greedy algorithm for high dimensions. Theoretically, we derive an asymptotic error bound for the optimal weights under some mild conditions. In addition, the summation of weights assigned to the correct candidate submodels is proven to approach one in probability when there are correct models included among the candidate submodels. Extensive numerical studies are carried out using both simulated and real datasets to show the proposed approach's robust performance compared to the existing regularization approaches. Supplementary materials for this article are available online.
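The abstract names two ingredients, a smooth concordance index and a greedy search for model weights, without giving formulas. The sketch below illustrates one common way to combine them and is not the authors' algorithm. Everything in it is an assumption made for illustration: the sigmoid smoothing with bandwidth `h`, the convention that a higher score means higher risk, the forward-stagewise greedy step, and the function names `smooth_cindex` and `greedy_weights`.

```python
import numpy as np

def smooth_cindex(time, event, score, h=0.1):
    """Sigmoid-smoothed concordance index for right-censored data (assumed form)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    score = np.asarray(score, dtype=float)
    # Pair (i, j) is comparable when subject i has an observed event before time j.
    comparable = event[:, None] & (time[:, None] < time[None, :])
    n_pairs = comparable.sum()
    if n_pairs == 0:
        raise ValueError("no comparable pairs")
    # Higher score means higher risk, so a concordant pair has score_i > score_j.
    diff = (score[:, None] - score[None, :]) / h
    smooth = 0.5 * (1.0 + np.tanh(0.5 * diff))  # sigmoid written in a stable form
    return float((smooth * comparable).sum() / n_pairs)

def greedy_weights(time, event, candidates, n_steps=20, h=0.1):
    """Greedy weights for averaging candidate risk scores.

    candidates: array of shape (n_samples, n_models), one column per candidate.
    Each step gives one more unit of weight to the candidate whose addition
    maximizes the smoothed concordance index of the averaged score.
    """
    candidates = np.asarray(candidates, dtype=float)
    n_models = candidates.shape[1]
    counts = np.zeros(n_models)
    for step in range(1, n_steps + 1):
        best_m, best_val = 0, -np.inf
        for m in range(n_models):
            trial = counts.copy()
            trial[m] += 1
            val = smooth_cindex(time, event, candidates @ (trial / step), h)
            if val > best_val:
                best_m, best_val = m, val
        counts[best_m] += 1
    # Weights are non-negative and sum to one by construction.
    return counts / n_steps
```

A call such as `greedy_weights(time, event, candidates)` returns one weight per candidate, and `candidates @ weights` gives the averaged risk score. The loop costs `n_steps * n_models` evaluations of the index, each quadratic in the sample size, so this sketch suits small examples only.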
Pages: 2658-2670
Number of pages: 13