DataSHIELD: resolving a conflict in contemporary bioscience - performing a pooled analysis of individual-level data without sharing the data

Cited by: 117
Authors
Wolfson, Michael [2]
Wallace, Susan E. [3, 4]
Masca, Nicholas [1, 5]
Rowe, Geoff [2]
Sheehan, Nuala A. [1, 5]
Ferretti, Vincent [4, 6]
LaFlamme, Philippe [4, 7]
Tobin, Martin D. [1, 5]
Macleod, John [8]
Little, Julian [4, 9]
Fortier, Isabel [4, 9, 10]
Knoppers, Bartha M. [3, 4]
Burton, Paul R. [1, 4, 5, 9, 11]
Affiliations
[1] Univ Leicester, Dept Hlth Sci, Leicester LE1 7RH, Leics, England
[2] STAT Canada, Ottawa, ON, Canada
[3] McGill Univ, Fac Med, Dept Human Genet, Ctr Genom & Policy, Montreal, PQ, Canada
[4] Publ Populat Project Genom P3G, Montreal, PQ, Canada
[5] Univ Leicester, Dept Genet, Leicester LE1 7RH, Leics, England
[6] MaRS Ctr, Ontario Inst Canc Res, Toronto, ON, Canada
[7] Genome Quebec Innovat Ctr, Montreal, PQ, Canada
[8] Univ Bristol, Dept Social Med, Bristol, Avon, England
[9] Univ Ottawa, Dept Epidemiol & Community Med, Ottawa, ON, Canada
[10] Univ Montreal, Dept Med Sociale & Prevent, Montreal, PQ, Canada
[11] McGill Univ, Dept Epidemiol Biostat & Occupat Hlth, Montreal, PQ, Canada
Funding
UK Medical Research Council; Wellcome Trust (UK)
Keywords
Pooling; analysis; meta-analysis; individual-level; study-level; generalized linear model; GLM; ethico-legal; ELSI; identification; disclosure; distributed computing; bioinformatics; information technology; IT; GENOME-WIDE ASSOCIATION; INCOME INEQUALITY; COMMON VARIANTS; LOCI; SUSCEPTIBILITY; EPIDEMIOLOGY; GENE; PRIVACY;
DOI
10.1093/ije/dyq111
Chinese Library Classification number
R1 [Preventive medicine, hygiene]
Subject classification code
1004; 120402
Abstract
Methods: Data aggregation through anonymous summary-statistics from harmonized individual-level databases (DataSHIELD) provides a simple approach to analysing pooled data that circumvents this conflict. This is achieved via parallelized analysis and modern distributed computing and, in one key setting, takes advantage of the properties of the updating algorithm for generalized linear models (GLMs).
Results: The conceptual use of DataSHIELD is illustrated in two different settings.
Conclusions: As the study of the aetiological architecture of chronic diseases advances to encompass more complex causal pathways (e.g. to include the joint effects of genes, lifestyle and environment), sample size requirements will increase further and the analysis of pooled individual-level data will become ever more important. An aim of this conceptual article is to encourage others to address the challenges and opportunities that DataSHIELD presents, and to explore potential extensions, for example to its use when different data sources hold different data on the same individuals.
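The GLM property mentioned in the abstract can be illustrated with a minimal sketch. It assumes a logistic regression fitted by Newton-Raphson (Fisher scoring) updates, where each study returns only its score vector and information matrix for the current coefficients, and a coordinating site sums them and takes the next step. The function names (`local_summaries`, `solve`, `fit_pooled`) and the toy data are hypothetical and are not taken from the DataSHIELD software; this is one reading of the idea, not the authors' implementation.

```python
import math

def local_summaries(beta, rows):
    """Run inside one study: return only aggregate summaries, never the rows.

    rows is a list of (covariates, outcome); covariates start with 1.0 (intercept).
    Returns the information matrix X'WX and score vector X'(y - mu).
    """
    p = len(beta)
    info = [[0.0] * p for _ in range(p)]
    score = [0.0] * p
    for x, y in rows:
        eta = sum(b * xi for b, xi in zip(beta, x))
        mu = 1.0 / (1.0 + math.exp(-eta))   # logistic mean
        w = mu * (1.0 - mu)                 # GLM working weight
        for j in range(p):
            score[j] += x[j] * (y - mu)
            for k in range(p):
                info[j][k] += w * x[j] * x[k]
    return info, score

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x

def fit_pooled(studies, n_params, iterations=25):
    """Coordinating site: sum per-study summaries, then update the coefficients."""
    beta = [0.0] * n_params
    for _ in range(iterations):
        info = [[0.0] * n_params for _ in range(n_params)]
        score = [0.0] * n_params
        for rows in studies:
            s_info, s_score = local_summaries(beta, rows)
            for j in range(n_params):
                score[j] += s_score[j]
                for k in range(n_params):
                    info[j][k] += s_info[j][k]
        step = solve(info, score)
        beta = [b + d for b, d in zip(beta, step)]
    return beta

# Invented toy data: (intercept, exposure) and a binary outcome in two studies.
study_a = [([1.0, 0.0], 0), ([1.0, 1.0], 0), ([1.0, 2.0], 1),
           ([1.0, 3.0], 0), ([1.0, 4.0], 1)]
study_b = [([1.0, 1.0], 0), ([1.0, 2.0], 1), ([1.0, 3.0], 1), ([1.0, 5.0], 1)]

distributed = fit_pooled([study_a, study_b], 2)   # summaries only
central = fit_pooled([study_a + study_b], 2)      # physically pooled data
```

Because the score vector and information matrix are sums over individuals, adding the per-study summaries gives the same update as fitting the physically pooled data, so `distributed` and `central` agree up to floating-point rounding.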
Pages: 1372-1382
Page count: 11
Related papers
50 in total
  • [31] ClustR: A Space-Time Cluster Analysis R Package for Individual-level Data
    Enders, Catherine
    Hyde, Rebecca J.
    Selvin, Steve
    Metayer, Catherine
    Francis, Stephen Starko
    EPIDEMIOLOGY, 2020, 31 (02) : 224 - 228
  • [32] Handling Missing Values in Interrupted Time Series Analysis of Longitudinal Individual-Level Data
    Bazo-Alvarez, Juan Carlos
    Morris, Tim P.
    Tra My Pham
    Carpenter, James R.
    Petersen, Irene
    CLINICAL EPIDEMIOLOGY, 2020, 12 : 1045 - 1057
  • [33] Inclusive financial development and common prosperity: An empirical analysis using individual-level data
    Luo, Hang
    Yan, Dawei
    ECONOMIC ANALYSIS AND POLICY, 2025, 85 : 261 - 274
  • [34] Re: Re-centering Exposure-Response Curves Without Access to Individual-Level Data
    Basagana, Xavier
    EPIDEMIOLOGY, 2020, 31 (02) : E18 - E19
  • [35] Comparison of Methods to Generalize Randomized Clinical Trial Results Without Individual-Level Data for the Target Population
    Hong, Jin-Liern
    Webster-Clark, Michael
    Funk, Michele Jonsson
    Sturmer, Til
    Dempster, Sara E.
    Cole, Stephen R.
    Herr, Iksha
    LoCasale, Robert
    AMERICAN JOURNAL OF EPIDEMIOLOGY, 2019, 188 (02) : 426 - 437
  • [36] Modeling contextual effects using individual-level data and without aggregation: an illustration of multilevel factor analysis (MLFA) with collective efficacy
    Erin C Dunn
    Katherine E Masyn
    William R Johnston
    SV Subramanian
    Population Health Metrics, 13
  • [37] Survival Analysis Without Sharing of Individual Patient Data by Using a Gaussian Copula
    Bonofiglio, Federico
    PHARMACEUTICAL STATISTICS, 2024, 23 (06) : 1031 - 1044
  • [38] Modeling contextual effects using individual-level data and without aggregation: an illustration of multilevel factor analysis (MLFA) with collective efficacy
    Dunn, Erin C.
    Masyn, Katherine E.
    Johnston, William R.
    Subramanian, S. V.
    POPULATION HEALTH METRICS, 2015, 13
  • [39] Income distribution and mortality: Implications from a comparison of individual-level analysis and multilevel analysis with Swedish data
    Henriksson, Goran
    Allebeck, Peter
    Weitoft, Gunilla Ringback
    Thelle, Dag
    SCANDINAVIAN JOURNAL OF PUBLIC HEALTH, 2006, 34 (03) : 287 - 294
  • [40] Enablers of senior entrepreneurial activity across Chile: an analysis using individual-level GEM data
    Torres-Marin, Alfonso Jesus
    Soria-Barreto, Karla
    Leporati, Marcelo
    INTERNATIONAL JOURNAL OF ENTREPRENEURSHIP & SMALL BUSINESS, 2024, 53 (03):