Causal Feature Selection in the Presence of Sample Selection Bias

Cited by: 3
Authors
Yang, Shuai [1]
Guo, Xianjie [2, 3]
Yu, Kui [2, 3]
Huang, Xiaoling [2, 3]
Jiang, Tingting [1]
He, Jin [1]
Gu, Lichuan [1]
Affiliations
[1] Anhui Agr Univ, Sch Informat & Comp, Hefei 230036, Peoples R China
[2] Hefei Univ Technol, Key Lab Knowledge Engn Big Data, Minist Educ, Hefei 230601, Peoples R China
[3] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Causal feature selection; sample selection bias; causal effect; EFFICIENT;
DOI
10.1145/3604809
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Almost all existing causal feature selection methods are proposed without considering the problem of sample selection bias. In practice, however, because the data-gathering process cannot be fully controlled, sample selection bias often occurs. It leads to spurious correlations between features and the class variable, which seriously degrades the performance of those existing methods. In this article, we study the problem of causal feature selection under sample selection bias and propose a novel Progressive Causal Feature Selection (PCFS) algorithm, which has three phases. First, PCFS learns sample weights that balance the treated-group and control-group distributions corresponding to each feature, in order to remove spurious correlations. Second, based on the sample weights, PCFS uses a weighted cross-entropy model to estimate the causal effect of each feature and removes some irrelevant features from the confounder set. Third, PCFS progressively repeats the first two phases to remove more irrelevant features and finally obtains a causal feature set. Experiments on synthetic and real-world datasets validate the effectiveness of PCFS in comparison with several state-of-the-art classical and causal feature selection methods.
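The three phases described in the abstract can be illustrated with a rough sketch. This is not the authors' PCFS implementation: the abstract does not give the objective functions, so everything below is an assumption made for illustration. It assumes binary features (so each feature splits the samples into a treated and a control group), balances only the weighted means of the other features, uses a weighted logistic regression as the weighted cross-entropy model, and drops features whose absolute standardized coefficient falls below a hypothetical `threshold`. The function names `balancing_weights`, `weighted_logistic_coefs`, and `progressive_selection` are invented here.

```python
import numpy as np

def balancing_weights(confounders, treatment, steps=200, lr=0.05):
    """Phase 1 (assumed form): learn positive sample weights that shrink the gap
    between the weighted confounder means of the treated and control groups."""
    n = len(treatment)
    if confounders.shape[1] == 0:
        return np.ones(n)  # nothing to balance
    t = treatment.astype(bool)
    z = (confounders - confounders.mean(axis=0)) / (confounders.std(axis=0) + 1e-12)
    theta = np.zeros(n)  # w = exp(theta) keeps every weight positive
    for _ in range(steps):
        w = np.exp(theta)
        s_t, s_c = w[t].sum(), w[~t].sum()
        mean_t = (w * t) @ z / s_t
        mean_c = (w * ~t) @ z / s_c
        gap = mean_t - mean_c
        # Gradient of ||gap||^2 with respect to each sample weight.
        grad_w = 2 * (t * ((z - mean_t) @ gap) / s_t - ~t * ((z - mean_c) @ gap) / s_c)
        theta -= lr * n * grad_w * w  # chain rule: dw/dtheta = w
    w = np.exp(theta)
    return w * n / w.sum()  # normalize so the weights average to 1

def weighted_logistic_coefs(features, y, w, steps=300, lr=1.0):
    """Phase 2 (assumed form): minimize a sample-weighted cross-entropy loss and
    return the coefficients of the standardized features (intercept dropped)."""
    z = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-12)
    x = np.column_stack([np.ones(len(y)), z])
    beta = np.zeros(x.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-x @ beta))
        beta -= lr * x.T @ (w * (p - y)) / w.sum()
    return beta[1:]

def progressive_selection(X, y, threshold=0.3, max_rounds=10):
    """Phase 3 (assumed form): repeat phases 1 and 2, dropping weak features,
    until the retained feature set stops changing."""
    remaining = list(range(X.shape[1]))
    for _ in range(max_rounds):
        effects = []
        for j in remaining:
            others = [k for k in remaining if k != j]
            w = balancing_weights(X[:, others], X[:, j])  # feature j is the "treatment"
            coefs = weighted_logistic_coefs(X[:, remaining], y, w)
            effects.append(abs(coefs[remaining.index(j)]))
        kept = [j for j, e in zip(remaining, effects) if e >= threshold]
        if kept == remaining:
            break
        remaining = kept
    return remaining

# Synthetic check: only features 0 and 1 influence the class variable.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(4000, 4)).astype(float)
logits = 2.0 * X[:, 0] + 2.0 * X[:, 1] - 2.0
y = (rng.random(4000) < 1.0 / (1.0 + np.exp(-logits))).astype(float)
selected = progressive_selection(X, y)
print(selected)
```

The synthetic data here contain no selection bias; the run only shows how the loop prunes features that have no effect on the class variable. Matching means is the simplest balancing criterion and was chosen for brevity; it says nothing about how the published method balances the two groups.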
Pages: 18
Related papers
50 in total
  • [1] Heterogeneous Causal Effects and Sample Selection Bias
    Breen, Richard
    Choi, Seongsoo
    Holm, Anders
    [J]. SOCIOLOGICAL SCIENCE, 2015, 2 : 351 - 369
  • [2] Identification of Causal Effects in the Presence of Selection Bias
    Correa, Juan D.
    Tian, Jin
    Bareinboim, Elias
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019 : 2744 - 2751
  • [3] Sample selection bias in evaluation of prediction performance of causal models
    Long, James P.
    Ha, Min Jin
    [J]. STATISTICAL ANALYSIS AND DATA MINING, 2022, 15 (01) : 5 - 14
  • [4] Causal Calculus in the Presence of Cycles, Latent Confounders and Selection Bias
    Forre, Patrick
    Mooij, Joris M.
    [J]. 35TH UNCERTAINTY IN ARTIFICIAL INTELLIGENCE CONFERENCE (UAI 2019), 2020, 115 : 71 - 80
  • [5] Recursive Causal Structure Learning in the Presence of Latent Variables and Selection Bias
    Akbari, Sina
    Mokhtarian, Ehsan
    Ghassami, AmirEmad
    Kiyavash, Negar
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [6] Estimating causal contrasts involving intermediate variables in the presence of selection bias
    Valeri, Linda
    Coull, Brent A.
    [J]. STATISTICS IN MEDICINE, 2016, 35 (26) : 4779 - 4793
  • [7] Iterative Causal Discovery in the Possible Presence of Latent Confounders and Selection Bias
    Rohekar, Raanan Y.
    Nisimov, Shami
    Gurwicz, Yaniv
    Novik, Gal
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [8] TESTING FOR SAMPLE SELECTION BIAS
    MELINO, A
    [J]. REVIEW OF ECONOMIC STUDIES, 1982, 49 (01) : 151 - 153
  • [9] MODELS FOR SAMPLE SELECTION BIAS
    WINSHIP, C
    MARE, RD
    [J]. ANNUAL REVIEW OF SOCIOLOGY, 1992, 18 : 327 - 350
  • [10] On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias
    Zhang, Jiji
    [J]. ARTIFICIAL INTELLIGENCE, 2008, 172 (16-17) : 1873 - 1896