The bootstrap: A technique for data-driven statistics. Using computer-intensive analyses to explore experimental data

Cited by: 203
Authors
Henderson, AR [1]
Affiliation
[1] Univ Western Ontario, Dept Biochem, London, ON N6A 5C1, Canada
Keywords
bootstrap; computer-intensive methods; jackknife; non-parametric statistics; permutation tests; random number generation
DOI
10.1016/j.cccn.2005.04.002
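Chinese Library Classification number
R446 [Laboratory diagnosis]; R-33 [Experimental medicine, medical experiments]
Subject classification code
1001
Abstract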
Background: The concept of resampling data, more commonly referred to as bootstrapping, has been in use for more than three decades. Bootstrapping has considerable theoretical advantages when it is applied to non-Gaussian data. Most of the published literature is concerned with the mathematical aspects of the bootstrap, but the technique is increasingly being used in medicine and other fields.
Methods: I reviewed the literature published after a 1994 paper that assessed the transfer of technology, including the bootstrap, to the biomedical literature.
Results: In the ten-year period following that 1994 paper there were 1679 published references to the technique in Medline. Over the same period, the following numbers of citations were found in the four major medical journals: British Medical Journal (48), JAMA (51), Lancet (52), and the New England Journal of Medicine (45).
Content: I introduce the basic theory of the bootstrap, the jackknife, and permutation tests. The bootstrap is used to estimate the accuracy of an estimator, for example through its standard error, a confidence interval, or its bias. The technique may be useful for analysing small, expensive-to-collect data sets where prior information is sparse, distributional assumptions are unclear, and further data may be difficult to acquire. Some elementary uses of bootstrapping are illustrated: calculating confidence intervals (for reference ranges or for experimental findings); hypothesis testing (comparing experimental findings); linear regression and correlation (studying association and prediction of variables); non-linear regression (as used in immunoassay techniques); and ROC curve processing.
Conclusions: These techniques can supplement current nonparametric statistical methods and should be included, where appropriate, in the armamentarium of data processing methodologies. (c) 2005 Elsevier B.V. All rights reserved.
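The abstract names three quantities the bootstrap can estimate (standard error, confidence interval, bias) and mentions permutation tests for comparing findings. The following is a minimal sketch of both ideas, not the paper's own procedure: it assumes a simple nonparametric bootstrap with a percentile interval and a two-sample permutation test on the difference in means. The function names `bootstrap` and `permutation_test`, their parameters, and the sample values are hypothetical and chosen only for illustration.

```python
import random
import statistics


def bootstrap(data, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Nonparametric bootstrap of `statistic` (illustrative sketch).

    Returns the observed estimate, the bootstrap standard error,
    the bootstrap bias estimate, and an approximate percentile interval.
    """
    rng = random.Random(seed)
    n = len(data)
    observed = statistic(data)
    # Draw resamples of size n with replacement and recompute the statistic.
    replicates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    std_error = statistics.stdev(replicates)
    bias = statistics.fmean(replicates) - observed
    # Approximate percentile interval read off the sorted replicates.
    lower = replicates[int((alpha / 2) * n_resamples)]
    upper = replicates[int((1 - alpha / 2) * n_resamples) - 1]
    return {
        "estimate": observed,
        "std_error": std_error,
        "bias": bias,
        "ci": (lower, upper),
    }


def permutation_test(a, b, n_permutations=2000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[: len(a)]) - statistics.fmean(pooled[len(a):])
        )
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    sample = [4.1, 5.3, 3.8, 6.7, 5.0, 4.4, 9.2, 5.6, 4.9, 5.1, 7.3, 4.6]
    print(bootstrap(sample, statistics.median))
    print(permutation_test([1, 2, 3, 4, 5], [101, 102, 103, 104, 105]))
```

The percentile interval is the simplest bootstrap interval; with small samples or strongly skewed statistics, other interval constructions may behave better, and the number of resamples is a tunable assumption rather than a recommendation.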
Pages: 1-26
Page count: 26