Causal inference and the data-fusion problem

Cited by: 372
Authors
Bareinboim, Elias [1, 2]
Pearl, Judea [1]
Affiliations
[1] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Funding
National Science Foundation (USA)
Keywords
causal inference; counterfactuals; external validity; selection bias; transportability; sample selection; propensity score; diagrams
DOI
10.1073/pnas.1510507113
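Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes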
07; 0710; 09
Abstract
We review concepts, principles, and tools that unify current approaches to causal analysis and attend to new challenges presented by big data. In particular, we address the problem of data fusion: piecing together multiple datasets collected under heterogeneous conditions (i.e., different populations, regimes, and sampling methods) to obtain valid answers to queries of interest. The availability of multiple heterogeneous datasets presents new opportunities to big data analysts, because the knowledge that can be acquired from combined data would not be possible from any individual source alone. However, the biases that emerge in heterogeneous environments require new analytical tools. Some of these biases, including confounding, sampling selection, and cross-population biases, have been addressed in isolation, largely in restricted parametric models. We here present a general, nonparametric framework for handling these biases and, ultimately, a theoretical solution to the problem of data fusion in causal inference tasks.
Pages: 7345-7352
Page count: 8