Ultra-high dimensional variable selection for doubly robust causal inference

Cited by: 9
Authors
Tang, Dingke [1]
Kong, Dehan [1]
Pan, Wenliang [2]
Wang, Linbo [1]
Affiliations
[1] Univ Toronto, Dept Stat Sci, Toronto, ON M5S 3G3, Canada
[2] Sun Yat Sen Univ, Sch Math, Dept Stat Sci, Guangzhou, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Alzheimer's disease; average causal effect; ball covariance; confounder selection; variable screening; PROPENSITY SCORE; ALZHEIMERS-DISEASE; MODEL SELECTION; ADAPTIVE LASSO; EFFICIENT; TAU; BIOMARKERS;
DOI
10.1111/biom.13625
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Causal inference has been increasingly reliant on observational studies with rich covariate information. To build tractable causal procedures, such as the doubly robust estimators, it is imperative to first extract important features from high or even ultra-high dimensional data. In this paper, we propose causal ball screening for confounder selection from modern ultra-high dimensional data sets. Unlike the familiar task of variable selection for prediction modeling, our confounder selection procedure aims to control for confounding while improving efficiency in the resulting causal effect estimate. Previous empirical and theoretical studies suggest excluding causes of the treatment that are not confounders. Motivated by these results, our goal is to keep all the predictors of the outcome in both the propensity score and outcome regression models. A distinctive feature of our proposal is that we use an outcome model-free procedure for propensity score model selection, thereby maintaining double robustness in the resulting causal effect estimator. Our theoretical analyses show that the proposed procedure enjoys a number of properties, including model selection consistency and pointwise normality. Synthetic and real data analyses show that our proposal compares favorably with existing methods in a range of realistic settings. Data used in preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.
Pages: 903-914
Number of pages: 12
Related papers
50 in total
  • [41] A variable oscillator for ultra-high frequency measurements
    King, R
    REVIEW OF SCIENTIFIC INSTRUMENTS, 1939, 10 (11): : 325 - 331
  • [42] Causal inference accounting for unobserved confounding after outcome regression and doubly robust estimation
    Genback, Minna
    de Luna, Xavier
    BIOMETRICS, 2019, 75 (02) : 506 - 515
  • [43] Doubly Robust Triple Cross-Fit Estimation for Causal Inference with Imaging Data
    Ke, Da
    Zhou, Xiaoxiao
    Yang, Qinglong
    Song, Xinyuan
    STATISTICS IN BIOSCIENCES, 2024,
  • [44] Quantile-adaptive variable screening in ultra-high dimensional varying coefficient models
    Zhang, Junying
    Zhang, Riquan
    Lu, Zhiping
    JOURNAL OF APPLIED STATISTICS, 2016, 43 (04) : 643 - 654
  • [46] Category-Adaptive Variable Screening for Ultra-High Dimensional Heterogeneous Categorical Data
    Xie, Jinhan
    Lin, Yuanyuan
    Yan, Xiaodong
    Tang, Niansheng
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2020, 115 (530) : 747 - 760
  • [47] PRIOR KNOWLEDGE GUIDED ULTRA-HIGH DIMENSIONAL VARIABLE SCREENING WITH APPLICATION TO NEUROIMAGING DATA
    He, Jie
    Kang, Jian
    STATISTICA SINICA, 2022, 32 : 2095 - 2117
  • [48] Dynamic artificial immune system with variable selection based on causal inference
    Shu, Yidan
    Zhao, Jinsong
    12TH INTERNATIONAL SYMPOSIUM ON PROCESS SYSTEMS ENGINEERING (PSE) AND 25TH EUROPEAN SYMPOSIUM ON COMPUTER AIDED PROCESS ENGINEERING (ESCAPE), PT B, 2015, 37 : 1793 - 1798
  • [49] Doubly robust inference when combining probability and non-probability samples with high dimensional data
    Yang, Shu
    Kim, Jae Kwang
    Song, Rui
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2020, 82 (02) : 445 - 465
  • [50] BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory
    Aghazadeh, Amirali
    Gupta, Vipul
    DeWeese, Alex
    Koyluoglu, O. Ozan
    Ramchandran, Kannan
    MATHEMATICAL AND SCIENTIFIC MACHINE LEARNING, VOL 145, 2021, 145 : 75 - 92