DataSHIELD: resolving a conflict in contemporary bioscience - performing a pooled analysis of individual-level data without sharing the data

Cited by: 117
Authors
Wolfson, Michael [2 ]
Wallace, Susan E. [3 ,4 ]
Masca, Nicholas [1 ,5 ]
Rowe, Geoff [2 ]
Sheehan, Nuala A. [1 ,5 ]
Ferretti, Vincent [4 ,6 ]
LaFlamme, Philippe [4 ,7 ]
Tobin, Martin D. [1 ,5 ]
Macleod, John [8 ]
Little, Julian [4 ,9 ]
Fortier, Isabel [4 ,9 ,10 ]
Knoppers, Bartha M. [3 ,4 ]
Burton, Paul R. [1 ,4 ,5 ,9 ,11 ]
Affiliations
[1] Univ Leicester, Dept Hlth Sci, Leicester LE1 7RH, Leics, England
[2] STAT Canada, Ottawa, ON, Canada
[3] McGill Univ, Fac Med, Dept Human Genet, Ctr Genom & Policy, Montreal, PQ, Canada
[4] Publ Populat Project Genom P3G, Montreal, PQ, Canada
[5] Univ Leicester, Dept Genet, Leicester LE1 7RH, Leics, England
[6] MaRS Ctr, Ontario Inst Canc Res, Toronto, ON, Canada
[7] Genome Quebec Innovat Ctr, Montreal, PQ, Canada
[8] Univ Bristol, Dept Social Med, Bristol, Avon, England
[9] Univ Ottawa, Dept Epidemiol & Community Med, Ottawa, ON, Canada
[10] Univ Montreal, Dept Med Sociale & Prevent, Montreal, PQ, Canada
[11] McGill Univ, Dept Epidemiol Biostat & Occupat Hlth, Montreal, PQ, Canada
Funding
UK Medical Research Council; Wellcome Trust (UK);
Keywords
Pooling; analysis; meta-analysis; individual-level; study-level; generalized linear model; GLM; ethico-legal; ELSI; identification; disclosure; distributed computing; bioinformatics; information technology; IT; GENOME-WIDE ASSOCIATION; INCOME INEQUALITY; COMMON VARIANTS; LOCI; SUSCEPTIBILITY; EPIDEMIOLOGY; GENE; PRIVACY;
DOI
10.1093/ije/dyq111
Chinese Library Classification number
R1 [Preventive medicine, hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
Methods: Data aggregation through anonymous summary-statistics from harmonized individual-level databases (DataSHIELD) provides a simple approach to analysing pooled data that circumvents this conflict. This is achieved via parallelized analysis and modern distributed computing and, in one key setting, takes advantage of the properties of the updating algorithm for generalized linear models (GLMs).
Results: The conceptual use of DataSHIELD is illustrated in two different settings.
Conclusions: As the study of the aetiological architecture of chronic diseases advances to encompass more complex causal pathways (e.g. to include the joint effects of genes, lifestyle and environment), sample size requirements will increase further and the analysis of pooled individual-level data will become ever more important. An aim of this conceptual article is to encourage others to address the challenges and opportunities that DataSHIELD presents, and to explore potential extensions, for example to its use when different data sources hold different data on the same individuals.
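The GLM updating step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a logistic model fitted by Fisher scoring: the score vector and information matrix are sums over individuals, so each study can compute its own contribution and send only those aggregates to a coordinator. This is not the DataSHIELD implementation; the function names (`site_summaries`, `solve`, `fit_pooled`) are hypothetical, and the "sites" here are plain in-memory lists rather than separate servers.

```python
import math

def site_summaries(X, y, beta):
    """Run locally at one study (hypothetical helper).

    Returns the information matrix X'WX and score vector X'(y - mu) for a
    logistic model at the current beta. Only these aggregates leave the site.
    """
    p = len(beta)
    info = [[0.0] * p for _ in range(p)]
    score = [0.0] * p
    for xi, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, xi))
        mu = 1.0 / (1.0 + math.exp(-eta))
        w = mu * (1.0 - mu)
        for j in range(p):
            score[j] += xi[j] * (yi - mu)
            for k in range(p):
                info[j][k] += w * xi[j] * xi[k]
    return info, score

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_pooled(sites, p, iterations=25, tol=1e-10):
    """Coordinator: sum per-site summaries, then take a Fisher scoring step."""
    beta = [0.0] * p
    for _ in range(iterations):
        total_info = [[0.0] * p for _ in range(p)]
        total_score = [0.0] * p
        for X, y in sites:  # in a real deployment, one request per study
            info, score = site_summaries(X, y, beta)
            for j in range(p):
                total_score[j] += score[j]
                for k in range(p):
                    total_info[j][k] += info[j][k]
        step = solve(total_info, total_score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta
```

Because the summed score and information are the same quantities a single analysis of the stacked data would compute, `fit_pooled([site_a, site_b], p=2)` should agree, up to floating-point rounding, with fitting the combined data set directly. Whether the aggregates themselves are non-disclosive in a given setting is a separate question that this sketch does not address.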
Pages: 1372 - 1382
Number of pages: 11
Related papers
50 in total
  • [21] Body Size Indicators and Risk of Gallbladder Cancer: Pooled Analysis of Individual-Level Data from 19 Prospective Cohort Studies
    Campbell, Peter T.
    Newton, Christina C.
    Kitahara, Cari M.
    Patel, Alpa V.
    Hartge, Patricia
    Koshiol, Jill
    McGlynn, Katherine A.
    Adami, Hans-Olov
    de Gonzalez, Amy Berrington
    Freeman, Laura E. Beane
    Bernstein, Leslie
    Buring, Julie E.
    Freedman, Neal D.
    Gao, Yu-Tang
    Giles, Graham G.
    Gunter, Marc J.
    Jenab, Mazda
    Liao, Linda M.
    Milne, Roger L.
    Robien, Kim
    Sandler, Dale P.
    Schairer, Catherine
    Sesso, Howard D.
    Shu, Xiao-Ou
    Weiderpass, Elisabete
    Wolk, Alicja
    Xiang, Yong-Bing
    Zeleniuch-Jacquotte, Anne
    Zheng, Wei
    Gapstur, Susan M.
    CANCER EPIDEMIOLOGY BIOMARKERS & PREVENTION, 2017, 26 (04) : 597 - 606
  • [22] Phenotypic heterogeneity in mortality and prognosis of pulmonary alveolar proteinosis: a large-scale, global pooled analysis of individual-level data
    Huang, Junfeng
    Xie, Shuojia
    Gao, Yuewen
    Lin, Zikai
    Xu, Zhe
    Lin, Jinsheng
    He, Linzhi
    Chen, Gengjia
    Zheng, Ziwen
    Xu, Zhixing
    Chen, Jingyan
    Guo, Jiaming
    Wu, Zhile
    Duan, Ailing
    Luo, Weizhan
    Song, Xinyu
    Li, Shiyue
    ORPHANET JOURNAL OF RARE DISEASES, 2025, 20 (01)
  • [23] Does Populism Fuel Affective Polarization? An Individual-Level Panel Data Analysis
    Perez-Rajo, Juan
    POLITICAL STUDIES, 2025, 73 (01) : 29 - 54
  • [24] Integrative analysis of individual-level data and high-dimensional summary statistics
    Fu, Sheng
    Deng, Lu
    Zhang, Han
    Qin, Jing
    Yu, Kai
    BIOINFORMATICS, 2023, 39 (04)
  • [25] From Return of Information to Return of Value: Ethical Considerations when Sharing Individual-Level Research Data
    Nebeker, Camille
    Leow, Alex D.
    Moore, Raeanne C.
    JOURNAL OF ALZHEIMERS DISEASE, 2019, 71 (04) : 1081 - 1088
  • [26] Spatial Analysis for Psychologists: How to Use Individual-Level Data for Research at the Geographically Aggregated Level
    Ebert, Tobias
    Goetz, Friedrich M.
    Mewes, Lars
    Rentfrow, P. Jason
    PSYCHOLOGICAL METHODS, 2023, 28 (05) : 1100 - 1121
  • [27] Separation of individual-level and cluster-level covariate effects in regression analysis of correlated data
    Begg, MD
    Parides, MK
    STATISTICS IN MEDICINE, 2003, 22 (16) : 2591 - 2602
  • [28] Improving Survey Inference Using Administrative Records Without Releasing Individual-Level Continuous Data
    Williams, Sharifa Z.
    Zou, Jungang
    Liu, Yutao
    Si, Yajuan
    Galea, Sandro
    Chen, Qixuan
    STATISTICS IN MEDICINE, 2024, 43 (30) : 5803 - 5813
  • [29] SURVIVAL ANALYSIS WITHOUT SHARING OF INDIVIDUAL PERSON DATA: AN ANTIDOTE TO "DATA AVAILABLE UPON REQUEST"?
    Bonofiglio, F.
    VALUE IN HEALTH, 2023, 26 (06) : S287 - S287
  • [30] Challenges In Performing An Individual Participant-level Data Meta-analysis
    van der Worp, Henk
    Holtman, Gea A.
    Blanker, Marco H.
    EUROPEAN UROLOGY FOCUS, 2023, 9 (05) : 705 - 707