A flexible approach for predictive biomarker discovery
Cited by: 3
Authors:
Boileau, Philippe [1,2]
Qi, Nina Ting [3]
van der Laan, Mark J. [4]
Dudoit, Sandrine [4]
Leng, Ning [3]
Affiliations:
[1] Univ Calif Berkeley, Grad Grp Biostat, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Ctr Computat Biol, Berkeley, CA 94720 USA
[3] Genentech Inc, 1 DNA Way, San Francisco, CA 94080 USA
[4] Univ Calif Berkeley, Ctr Computat Biol, Dept Stat, Div Biostat, Berkeley, CA 94720 USA
Funding:
Natural Sciences and Engineering Research Council of Canada;
Keywords:
Heterogeneous treatment effects;
High-dimensional data;
Nonparametric statistics;
Predictive biomarkers;
Precision medicine;
Variable importance;
VARIABLE SELECTION;
MODELS;
DOI:
10.1093/biostatistics/kxac029
Chinese Library Classification (CLC) number:
Q [Biological Sciences];
Subject classification codes:
07 ;
0710 ;
09 ;
Abstract:
An endeavor central to precision medicine is predictive biomarker discovery: predictive biomarkers define the patient subpopulations that stand to benefit most, or least, from a given treatment. The identification of these biomarkers is often the byproduct of the related but fundamentally different task of treatment rule estimation. Using treatment rule estimation methods to identify predictive biomarkers in clinical trials where the number of covariates exceeds the number of participants often results in high false discovery rates. The higher than expected number of false positives translates to wasted resources when conducting follow-up experiments for drug target identification and diagnostic assay development. Patient outcomes are in turn negatively affected. We propose a variable importance parameter for directly assessing the importance of potentially predictive biomarkers and develop a flexible nonparametric inference procedure for this estimand. We prove that our estimator is double robust and asymptotically linear under loose conditions on the data-generating process, permitting valid inference about the importance metric. The statistical guarantees of the method are verified in a thorough simulation study representative of randomized controlled trials with moderate and high-dimensional covariate vectors. Our procedure is then used to discover predictive biomarkers from among the tumor gene expression data of metastatic renal cell carcinoma patients enrolled in recently completed clinical trials. We find that our approach more readily discerns predictive from nonpredictive biomarkers than procedures whose primary purpose is treatment rule estimation. An open-source software implementation of the methodology, the uniCATE R package, is briefly introduced.
Pages: 1085-1105
Number of pages: 21