Generalized Jaccard feature screening for ultra-high dimensional survival data

Authors
Liu, Renqing [1]
Deng, Guangming [1,2]
He, Hanji [3]
Affiliations
[1] Guilin Univ Technol, Sch Math & Stat, Guilin 541004, Peoples R China
[2] Guangxi Coll & Univ, Key Lab Appl Stat, Guilin 541004, Peoples R China
[3] South China Univ Technol, Sch Econ & Finance, Guangzhou 510006, Peoples R China
Source
AIMS MATHEMATICS | 2024, Vol. 9, No. 10
Keywords
generalized Jaccard coefficient; ultra-high dimensional survival data; model-free; VARIABLE SELECTION; LINEAR-MODELS
DOI
10.3934/math.20241341
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Feature screening methods play a vital role in identifying the critical genomes that influence a cancer patient's survival time. Most current research relies on a fixed survival function model, which limits its generality in practical applications. In this paper, we propose the generalized Jaccard coefficient (GJAC), which extends the traditional Jaccard coefficient from measuring the similarity of binary vectors to measuring the correlation between general vectors. The larger the GJAC value, the higher the sample similarity. Using the GJAC, we introduce a novel model-free screening method to select the active set of covariates in ultra-high dimensional survival data. In Monte Carlo simulations, GJAC-Sure Independence Screening (GJAC-SIS) shows higher accuracy, lower error, and excellent applicability across different types of survival data compared with existing model-free feature screening methods for survival data. Additionally, on the real cancer dataset (DLBCL), GJAC-SIS screens out two additional important genomes that have been confirmed by biomedical experiments, whereas the other five methods do not. As a result, GJAC-SIS achieves high screening precision, delivers a more effective screening outcome, and offers better utility and generality.
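The abstract does not state the GJAC formula, so the following is only a rough sketch of the general idea, not the authors' method. It assumes the Tanimoto form of the extended Jaccard coefficient, x·y / (|x|² + |y|² − x·y), which reduces to the ordinary Jaccard coefficient on binary vectors. The function names `tanimoto`, `standardize`, and `screen` are hypothetical, the sign-invariant score is an illustrative choice, and censoring is ignored entirely, which a real survival-data screening method would have to handle.

```python
import math


def tanimoto(x, y):
    """Tanimoto (extended Jaccard) similarity of two real vectors.

    Assumed form: x.y / (|x|^2 + |y|^2 - x.y). The paper's GJAC may differ.
    """
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    # The denominator is zero only when both vectors are all zeros.
    return dot / denom if denom else 0.0


def standardize(v):
    """Center a vector and scale it to unit (population) standard deviation."""
    n = len(v)
    mean = sum(v) / n
    sd = math.sqrt(sum((a - mean) ** 2 for a in v) / n)
    return [(a - mean) / sd for a in v] if sd else [0.0] * n


def screen(columns, response, d):
    """Return the indices of the d covariates most similar to the response.

    Each covariate is scored marginally (one at a time), as in sure
    independence screening. Censoring is NOT handled in this sketch.
    """
    y = standardize(response)
    scores = []
    for col in columns:
        x = standardize(col)
        flipped = [-a for a in x]
        # Illustrative choice: take the larger of the two signs so that
        # strong negative association ranks as high as positive association.
        scores.append(max(tanimoto(x, y), tanimoto(flipped, y)))
    order = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return order[:d]


# Toy data: covariate 1 tracks the response closely, the others do not.
columns = [
    [1, -1, 1, -1, 1, -1],
    [1, 2, 3, 4, 5, 6],
    [3, 1, 2, 3, 1, 2],
]
response = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8]
selected = screen(columns, response, d=1)
```

On standardized vectors this score is a monotone function of the absolute Pearson correlation, so the sketch illustrates only the ranking-and-thresholding step of a screening procedure, not any advantage specific to GJAC-SIS.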
Pages: 27607-27626
Page count: 20