A simulation study comparing the power of nine tests of the treatment effect in randomized controlled trials with a time-to-event outcome

Cited by: 20
Authors
Royston, Patrick [1 ]
Parmar, Mahesh K. B. [1 ]
Affiliations
[1] UCL, Inst Clin Trials & Methodol, MRC Clin Trials Unit, 90 High Holborn, London WC1V 6LJ, England
Funding
Medical Research Council (UK)
Keywords
Randomized controlled trials; Time-to-event outcome; Logrank test; Hazard ratio; Non-proportional hazards; Versatile test; Power; Simulation; Robustness; SAMPLE-SIZE ANALYSIS; PARMAR COMBINED TEST; LOG-RANK; CLINICAL-TRIALS; VERSATILE TESTS; SURVIVAL; HAZARDS;
DOI
10.1186/s13063-020-4153-2
Chinese Library Classification (CLC) number
R-3 [Medical research methods]; R3 [Basic medicine]
Discipline classification code
1001
Abstract
Background: The logrank test is routinely applied to design and analyse randomized controlled trials (RCTs) with time-to-event outcomes. Sample size and power calculations assume the treatment effect follows proportional hazards (PH). If the PH assumption is false, power is reduced and interpretation of the hazard ratio (HR) as the estimated treatment effect is compromised. Using statistical simulation, we investigated the type 1 error and power of the logrank (LR) test and eight alternatives. We aimed to identify test(s) that improve power with three types of non-proportional hazards (non-PH): early, late or near-PH treatment effects.

Methods: We investigated weighted logrank tests (early, LRE; late, LRL), the supremum logrank test (SupLR) and composite tests (joint, J; combined, C; weighted combined, WC; versatile and modified versatile weighted logrank, VWLR, VWLR2) with two or more components. Weighted logrank tests are intended to be sensitive to particular non-PH patterns. Composite tests attempt to improve power across a wider range of non-PH patterns. Using extensive simulations based on real trials, we studied test size and power under PH and under simple departures from PH comprising pointwise constant HRs with a single change point at various follow-up times. We systematically investigated the influence of high or low control-arm event rates on power.

Results: With no preconceived type of treatment effect, the preferred test is VWLR2. Expecting an early effect, tests with acceptable power are SupLR, C, VWLR2, J, LRE and WC. Expecting a late effect, acceptable tests are LRL, VWLR, VWLR2, WC and J. Under near-PH, acceptable tests are LR, LRE, VWLR, C, VWLR2 and SupLR. Type 1 error was well controlled for all tests, showing only minor deviations from the nominal 5%. The location of the HR change point relative to the cumulative proportion of control-arm events considerably affected power.

Conclusions: Assuming ignorance of the likely treatment effect, the best choice is VWLR2. Several non-standard tests performed well when the correct type of treatment effect was assumed. A low control-arm event rate reduced the power of weighted logrank tests targeting early effects. Test size was generally well controlled. Further investigation of test characteristics with different types of non-proportional hazards of the treatment effect is warranted.
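As a rough illustration of the kind of simulation the abstract describes, the sketch below draws survival times whose hazard ratio is constant before and after a single change point, then computes a weighted logrank statistic. It is not the authors' code. The function names, the parameter values and the use of Fleming-Harrington weights (one common way to build early- or late-weighted logrank tests) are assumptions made here for illustration; the abstract does not specify the weights or the simulation settings.

```python
import numpy as np

def simulate_arm(n, base_hazard, hr_early, hr_late, change_point, rng):
    """Event times with hazard base_hazard*hr_early before change_point
    and base_hazard*hr_late afterwards (hypothetical helper)."""
    h1 = base_hazard * hr_early
    h2 = base_hazard * hr_late
    e = rng.exponential(1.0, size=n)  # unit-exponential draws
    # Invert the piecewise-linear cumulative hazard H(t) at e.
    return np.where(e <= h1 * change_point,
                    e / h1,
                    change_point + (e - h1 * change_point) / h2)

def simulate_trial(n_per_arm, hr_early, hr_late, change_point,
                   base_hazard, follow_up, rng):
    """Two-arm trial with administrative censoring at follow_up."""
    control = simulate_arm(n_per_arm, base_hazard, 1.0, 1.0, change_point, rng)
    treated = simulate_arm(n_per_arm, base_hazard, hr_early, hr_late,
                           change_point, rng)
    t = np.concatenate([control, treated])
    group = np.concatenate([np.zeros(n_per_arm, dtype=int),
                            np.ones(n_per_arm, dtype=int)])
    event = (t <= follow_up).astype(int)
    time = np.minimum(t, follow_up)
    return time, event, group

def weighted_logrank_z(time, event, group, rho=0.0, gamma=0.0):
    """Z statistic with Fleming-Harrington weights S(t-)^rho * (1 - S(t-))^gamma.
    rho = gamma = 0 gives the standard logrank test; rho > 0 emphasises
    early differences and gamma > 0 emphasises late differences."""
    s_prev = 1.0          # pooled Kaplan-Meier estimate just before t
    num, var = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        n = at_risk.sum()
        n1 = (at_risk & (group == 1)).sum()
        died = (time == t) & (event == 1)
        d = died.sum()
        d1 = (died & (group == 1)).sum()
        w = s_prev ** rho * (1.0 - s_prev) ** gamma
        num += w * (d1 - d * n1 / n)          # observed minus expected
        if n > 1:
            var += w ** 2 * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
        s_prev *= 1.0 - d / n
    return num / np.sqrt(var)

# Example: a late treatment effect (HR 1.0 before t = 1, HR 0.5 afterwards).
rng = np.random.default_rng(2020)
time, event, group = simulate_trial(500, 1.0, 0.5, 1.0, 0.5, 5.0, rng)
for label, rho, gamma in [("standard", 0, 0), ("early", 1, 0), ("late", 0, 1)]:
    print(label, weighted_logrank_z(time, event, group, rho, gamma))
```

Repeating the simulation many times and counting how often each statistic exceeds its critical value would give an empirical power estimate for each weighting, which is the general idea behind comparing tests under early, late and near-PH effects.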
Pages: 17
Related papers
50 in total
  • [21] Sensitivity Analysis of Per-Protocol Time-to-Event Treatment Efficacy in Randomized Clinical Trials
    Gilbert, Peter B.
    Shepherd, Bryan E.
    Hudgens, Michael G.
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2013, 108 (503) : 789 - 800
  • [22] Incorporating prognostic factors into causal estimators: A comparison of methods for randomised controlled trials with a time-to-event outcome
    Hampson, Lisa V.
    Metcalfe, Chris
    [J]. STATISTICS IN MEDICINE, 2012, 31 (26) : 3073 - 3088
  • [23] An audit strategy for time-to-event outcomes measured with error: Application to five randomized controlled trials in oncology
    Dodd, Lori E.
    Korn, Edward L.
    Freidlin, Boris
    Gu, Wenjuan
    Abrams, Jeffrey S.
    Bushnell, William D.
    Canetta, Renzo
    Doroshow, James H.
    Gray, Robert J.
    Sridhara, Rajeshwari
    [J]. CLINICAL TRIALS, 2013, 10 (05) : 754 - 760
  • [24] Rejoinder for discussions on correct and logical causal inference for binary and time-to-event outcomes in randomized controlled trials
    Liu, Yi
    Wang, Bushi
    Tian, Hong
    Hsu, Jason C.
    [J]. BIOMETRICAL JOURNAL, 2022, 64 (02) : 246 - 255
  • [25] Counterfactual mediation analysis in the multistate model framework for surrogate and clinical time-to-event outcomes in randomized controlled trials
    Weir, Isabelle R.
    Rider, Jennifer R.
    Trinquart, Ludovic
    [J]. PHARMACEUTICAL STATISTICS, 2022, 21 (01) : 163 - 175
  • [26] Reporting Time-to-Event Endpoints and Response Rates in 4 Decades of Randomized Controlled Trials in Advanced Colorectal Cancer
    Arkenau, Hendrik-Tobias
    Nordman, Ina
    Dobbins, Timothy
    Ward, Robyn
    [J]. CANCER, 2011, 117 (04) : 832 - 840
  • [27] Remission and response in the treatment of bipolar depression: Time-To-Event and NNT analyses from a large, randomized, controlled study of quetiapine
    Cookson, JC
    Keck, PE
    Ketter, TA
    Macfadden, W
    Minkwitz, M
    Mullen, J
    [J]. EUROPEAN PSYCHIATRY, 2005, 20 : S141 - S141
  • [28] Power and sample-size analysis for the Royston Parmar combined test in clinical trials with a time-to-event outcome
    Royston, Patrick
    [J]. STATA JOURNAL, 2018, 18 (01) : 3 - 21
  • [29] Statistical power in parallel group point exposure studies with time-to-event outcomes: an empirical comparison of the performance of randomized controlled trials and the inverse probability of treatment weighting (IPTW) approach
    Austin, Peter C.
    Schuster, Tibor
    Platt, Robert W.
    [J]. BMC MEDICAL RESEARCH METHODOLOGY, 2015, 15
  • [30] Statistical power in parallel group point exposure studies with time-to-event outcomes: an empirical comparison of the performance of randomized controlled trials and the inverse probability of treatment weighting (IPTW) approach
    Peter C. Austin
    Tibor Schuster
    Robert W. Platt
    [J]. BMC Medical Research Methodology, 15