Exploiting Censored Information in Self-Training for Time-to-Event Prediction

Times cited: 0
Authors
Haredasht, Fateme Nateghi [1, 2]
Dauda, Kazeem Adesina [1, 2, 3]
Vens, Celine [1, 2]
Affiliations
[1] Katholieke Univ Leuven, Dept Publ Hlth & Primary Care, Campus KULAK, B-8500 Kortrijk, Belgium
[2] Katholieke Univ Leuven, ITEC imec, B-8500 Kortrijk, Belgium
[3] Kwara State Univ, Dept Math & Stat, Malete 241103, Nigeria
Keywords
Random survival forest; self-training; semi-supervised learning; survival analysis; variable selection; survival; regression
DOI
10.1109/ACCESS.2023.3312310
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A common problem in medical applications is predicting the time until an event of interest, such as the onset of a disease, tumor recurrence, or death. Traditionally, classical survival analysis techniques have been used to address this problem. However, these techniques are of limited use when biomarkers exhibit nonlinear and interaction effects, or when the survival datasets are high-dimensional. Although supervised machine learning techniques have shown some advantages over standard statistical methods in handling high-dimensional datasets, their application to survival analysis, particularly in the context of feature-based approaches, remains limited. A major reason is the difficulty of processing censored data, which is a common component of survival analysis. In this paper, we transform the time-to-event prediction problem into a semi-supervised regression problem. We use a self-training wrapper approach, in which an outer layer guides the iterative refinement of predictions. This approach improves model performance by leveraging confident predictions on censored instances. The self-training wrapper is applied with random survival forests as the base learner. Censored observations are treated as partially labeled observations, since their predicted time (target value) should exceed the censoring time. The algorithm first builds a base model over the observed instances and then iteratively augments them with highly confident predictions over the censored set, using a stopping criterion based on the censoring time. The proposed approach has been evaluated and compared on fifteen real-world survival analysis datasets, including clinical and high-dimensional data. Its ability to integrate partial supervision information within a semi-supervised learning strategy enables it to achieve competitive performance compared to baseline models, particularly in the high-dimensional regime.
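The self-training loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Several details are assumptions made for illustration only: a bootstrap ensemble of k-nearest-neighbour regressors stands in for the random survival forest, confidence is measured as the spread of ensemble predictions, and the loop stops when no censored instance receives a prediction above its censoring time (the abstract does not specify the exact stopping rule). The names `knn_predict`, `ensemble_predict`, `self_train`, and `top_frac` are hypothetical.

```python
import random
import statistics

def knn_predict(train_X, train_y, x, k=3):
    """Mean target of the k nearest training points (squared Euclidean distance)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def ensemble_predict(X, y, x, rng, n_models=25, k=3):
    """Bootstrap ensemble (stand-in for a forest): mean and spread of predictions."""
    preds = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        preds.append(knn_predict([X[i] for i in idx], [y[i] for i in idx], x, k))
    return statistics.fmean(preds), statistics.pstdev(preds)

def self_train(X, time, event, max_iter=10, top_frac=0.2, seed=0):
    """Augment the observed (uncensored) set with confident pseudo-labels.

    Returns the augmented features and targets, the list of
    (censoring_time, pseudo_label) pairs that were added, and the
    censored instances that were never pseudo-labeled.
    """
    rng = random.Random(seed)
    lab_X = [x for x, e in zip(X, event) if e]
    lab_y = [t for t, e in zip(time, event) if e]
    pool = [(x, t) for x, t, e in zip(X, time, event) if not e]
    added = []
    for _ in range(max_iter):
        if not pool:
            break
        scored = []
        for j, (x, c) in enumerate(pool):
            mean, std = ensemble_predict(lab_X, lab_y, x, rng)
            # Partial label: a valid prediction must exceed the censoring time.
            if mean > c:
                scored.append((std, j, mean))
        if not scored:
            break  # assumed stopping rule: no prediction is consistent
        scored.sort()  # smallest spread first, i.e. most confident first
        chosen = scored[:max(1, int(top_frac * len(scored)))]
        for _, j, mean in chosen:
            lab_X.append(pool[j][0])
            lab_y.append(mean)
            added.append((pool[j][1], mean))
        chosen_idx = {j for _, j, _ in chosen}
        pool = [p for j, p in enumerate(pool) if j not in chosen_idx]
    return lab_X, lab_y, added, pool

# Synthetic toy data (not from the paper): one feature, linear event time.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 10)] for _ in range(60)]
true_time = [2.0 * x[0] + 1.0 + data_rng.gauss(0, 0.5) for x in X]
censor = [data_rng.uniform(0, 25) for _ in X]
event = [t <= c for t, c in zip(true_time, censor)]
time = [min(t, c) for t, c in zip(true_time, censor)]

lab_X, lab_y, added, pool = self_train(X, time, event)
print(f"observed: {sum(event)}, pseudo-labeled: {len(added)}, left censored: {len(pool)}")
```

In this sketch every pseudo-label respects its partial label by construction, because a censored instance is only added when its predicted time exceeds its censoring time. A real survival forest would also use censored instances during tree construction, which this stand-in does not attempt.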
Pages: 96831-96840
Number of pages: 10