Mediation analysis using incomplete information from publicly available data sources

Authors
Derkach, Andriy [1]
Kantor, Elizabeth D. [1]
Sampson, Joshua N. [2]
Pfeiffer, Ruth M. [2]
Affiliations
[1] Memorial Sloan Kettering Cancer Center, Department of Epidemiology and Biostatistics, 485 Lexington Ave, New York, NY 10017, USA
[2] National Cancer Institute, Division of Cancer Epidemiology and Genetics, National Institutes of Health, Rockville, MD, USA
Funding
U.S. National Institutes of Health (NIH);
Keywords
data integration; direct and indirect effects; registry data; summary level information; survey sampling; COLORECTAL-CANCER; REGRESSION-ANALYSIS; INFERENCE; DISPARITIES; BOUNDS; RISK; RACE;
DOI
10.1002/sim.10076
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07; 0710; 09;
Abstract
Our work was motivated by the question of whether, and to what extent, well-established risk factors mediate the racial disparity observed in colorectal cancer (CRC) incidence in the United States. Mediation analysis examines the relationships between an exposure, a mediator, and an outcome. All available methods require access to a single complete data set with these three variables. However, because population-based studies usually include few non-White participants, these approaches have limited utility in answering our motivating question. Recently, we developed novel methods to integrate several data sets with incomplete information for mediation analysis. These methods have two limitations: (i) they consider only a single mediator and (ii) they require a data set containing individual-level data on the mediator and exposure (and possibly confounders) obtained by independent and identically distributed sampling from the target population. Here, we propose a new method for mediation analysis with several different data sets that accommodates complex survey and registry data and allows for multiple mediators. In simulations, the proposed approach yields unbiased causal effect estimates and confidence intervals with nominal coverage. We apply our method to data from U.S. cancer registries, a survey representative of the U.S. population, and summary-level odds ratio estimates to rigorously evaluate what proportion of the difference in CRC risk between non-Hispanic Whites and Blacks is mediated by three potentially modifiable risk factors (CRC screening history, body mass index, and regular aspirin use).
Pages: 2695-2712
Number of pages: 18
Related papers
  • [1] Acquisition and Processing of Information from Slovak Publicly Available Data
    Hricova, Romana
    Adamcik, Stanislav
    [J]. INDUSTRY 4.0: TRENDS IN MANAGEMENT OF INTELLIGENT MANUFACTURING SYSTEMS, 2019, : 23 - 35
  • [2] Challenges in big data chemistry using publicly available chemical information
    Kim, Sunghwan
    Fu, Gang
    Hahnke, Volker
    Han, Lianyi
    Yu, Bo
    Geer, Lewis
    Shoemaker, Benjamin
    Gindulyte, Asta
    He, Siqian
    Thiessen, Paul
    Bolton, Evan
    Bryant, Stephen
    [J]. ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2015, 250
  • [3] The oil company as publicly available data sources for the experts
    Tokic, I.
    [J]. KEMIJA U INDUSTRIJI-JOURNAL OF CHEMISTS AND CHEMICAL ENGINEERS, 2006, 55 (12): : 532 - 533
  • [4] Accelerating Adverse Outcome Pathway Development Using Publicly Available Data Sources
    Oki, N. O.
    Nelms, M. D.
    Bell, S. M.
    Mortensen, H. M.
    Edwards, S. W.
    [J]. Current Environmental Health Reports, 2016, 3 (1) : 53 - 63
  • [5] NLP detection of mortality information from publicly available data using deep learning modeling
    Al-Garadi, Mohammed A.
    Matheny, Michael E.
    Desai, Rishi J.
    Khan, Mirza S.
    Wang, Shirley V.
    Maro, Judith C.
    Fuller, Candace C.
    Lin, Kueiyu Joshua
    Hernandez-Munoz, Jose J.
    Kuzucan, Aida
    Wang, Xi
    Whitaker, Jill
    McLemore, Michael F.
    Westerman, Dax
    Osmanski, Joshua T.
    Reeves, Ruth M.
    [J]. PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2023, 32 : 425 - 426
  • [6] ASSESSING THE FEASIBILITY OF OBTAINING PRODUCT INGREDIENT DATA FROM PUBLICLY AVAILABLE SOURCES
    BYER, WL
    [J]. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1982, 22 (04): : 190 - 195
  • [7] ESTIMATE OF UNITED STATES ANTIHYPERTENSIVE MEDICATION COSTS USING DATA FROM THREE PUBLICLY AVAILABLE SOURCES
    Tajeu, Gabriel
    Ruiz-Negron, Natalia
    King, Jordan
    Nelson, Richard
    Moran, Andrew
    Bellows, Brandon
    [J]. MEDICAL DECISION MAKING, 2020, 40 (01) : E282 - E283
  • [8] Data sources publicly available in Brazil for drug utilization research
    Leal, Lisiane F.
    Osorio-de-Castro, Claudia G. S.
    Ferre, Felipe
    Zimmermann, Ivan R.
    Mota, Daniel M.
    Fulone, Izabela
    Fonteles, Marta Mde F.
    Baldoni, Andre O.
    Elseviers, Monique
    Lopes, Luciane C.
    [J]. PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2020, 29 : 183 - 184
  • [9] An Investigation of the Impact Publicly Available Accounting Data, Other Publicly Available Information and Management Guidance on Analysts' Forecasts
    Newman, Michael R.
    Gamble, George O.
    Chin, Wynne W.
    Murray, Michael J.
    [J]. NEW PERSPECTIVES IN PARTIAL LEAST SQUARES AND RELATED METHODS, 2013, 56 : 315 - 339
  • [10] INTEGRATING INCOMPLETE DATA FOR MEDIATION ANALYSIS
    Derkach, Andriy
    Sampson, Joshua N.
    Pfeiffer, Ruth M.
    [J]. STATISTICA SINICA, 2024, 34 (02) : 1045 - 1066