The bootstrap: A technique for data-driven statistics. Using computer-intensive analyses to explore experimental data

Cited by: 203
Authors
Henderson, AR [1]
Affiliation
[1] Univ Western Ontario, Dept Biochem, London, ON N6A 5C1, Canada
Keywords
bootstrap; computer-intensive methods; jackknife; non-parametric statistics; permutation tests; random number generation
DOI
10.1016/j.cccn.2005.04.002
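Chinese Library Classification number
R446 [Laboratory diagnosis]; R-33 [Experimental medicine, medical experiments]
Subject classification code
1001
Abstract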
Background: Resampling data, more commonly referred to as bootstrapping, has been in use for more than three decades. Bootstrapping has considerable theoretical advantages when applied to non-Gaussian data. Most of the published literature concerns the mathematical aspects of the bootstrap, but the technique is increasingly being used in medical and other fields.
Methods: I reviewed the literature published after a 1994 paper that assessed the transfer of technology, including the bootstrap, to the biomedical literature.
Results: In the ten years following that 1994 paper, there were 1679 published references to the technique in Medline. Over the same period, the following numbers of citations were found in the four major medical journals: British Medical Journal (48), JAMA (51), Lancet (52), and the New England Journal of Medicine (45).
Content: I introduce the basic theory of the bootstrap, the jackknife, and permutation tests. The bootstrap is used to assess the accuracy of an estimator, for example through its standard error, a confidence interval, or its bias. The technique may be useful for analysing small, expensive-to-collect data sets where prior information is sparse, distributional assumptions are unclear, and further data may be difficult to acquire. Some elementary uses of bootstrapping are illustrated: calculating confidence intervals (for reference ranges or for experimental findings), hypothesis testing (comparing experimental findings), linear regression and correlation (studying association and prediction of variables), non-linear regression (as used in immunoassay techniques), and ROC curve processing.
Conclusions: These techniques can supplement current nonparametric statistical methods and should be included, where appropriate, in the armamentarium of data processing methodologies. (c) 2005 Elsevier B.V. All rights reserved.
Pages: 1-26
Page count: 26
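
The abstract states that the bootstrap can estimate the standard error, a confidence interval, or the bias of an estimator. A minimal sketch of that idea follows, assuming a simple nonparametric bootstrap with a percentile interval. The function name `bootstrap`, its parameters, and the sample values are hypothetical illustrations and are not taken from the paper.

```python
import random
import statistics


def bootstrap(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Nonparametric bootstrap of `stat`.

    Returns (estimate, std_error, bias, (lo, hi)), where (lo, hi) is a
    percentile confidence interval at level 1 - alpha.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    estimate = stat(data)
    # Draw n observations with replacement and recompute the statistic each time.
    replicates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Spread of the replicates approximates the standard error of the estimator.
    std_error = statistics.stdev(replicates)
    # Bootstrap bias estimate: mean of the replicates minus the original estimate.
    bias = statistics.mean(replicates) - estimate
    # Percentile interval: empirical alpha/2 and 1 - alpha/2 quantiles.
    lo = replicates[int((alpha / 2) * n_resamples)]
    hi = replicates[int((1 - alpha / 2) * n_resamples) - 1]
    return estimate, std_error, bias, (lo, hi)


# Made-up, skewed sample purely for illustration.
sample = [4.1, 5.3, 3.8, 6.0, 4.7, 5.1, 9.2, 4.4, 5.0, 4.9]
est, se, bias, (lo, hi) = bootstrap(sample, stat=statistics.median)
print(f"median={est:.2f} se={se:.2f} bias={bias:+.2f} CI=({lo:.2f}, {hi:.2f})")
```

The percentile interval is only one of several bootstrap interval constructions; it is used here because it is the shortest to write down, not because the paper is known to recommend it.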