A common objective of biomarker studies is to develop a predictor of patient survival outcome. Determining the number of samples required to train a predictor from survival data is important for designing such studies. Existing sample size methods for training studies use parametric models for the high-dimensional data and cannot handle a right-censored dependent variable. We present a new training sample size method that is non-parametric with respect to the high-dimensional vectors and is developed for a right-censored response. The method can be applied to any prediction algorithm that satisfies a set of conditions. The sample size is chosen so that the expected performance of the predictor is within a user-defined tolerance of optimal. The central method is based on a pilot dataset. To quantify uncertainty, we develop a method for constructing a confidence interval for the tolerance, and we discuss whether the size of the pilot dataset is adequate. We also present an alternative, model-based version of the method for estimating the tolerance when no adequate pilot dataset is available. The model-based method requires that a covariance matrix be specified, but we show that the identity covariance matrix provides an adequate sample size when the user specifies three key quantities. Application of the sample size method to two microarray datasets is discussed.