Causal Feature Selection in the Presence of Sample Selection Bias

Cited by: 3
Authors
Yang, Shuai [1]
Guo, Xianjie [2,3]
Yu, Kui [2,3]
Huang, Xiaoling [2,3]
Jiang, Tingting [1]
He, Jin [1]
Gu, Lichuan [1]
Affiliations
[1] Anhui Agr Univ, Sch Informat & Comp, Hefei 230036, Peoples R China
[2] Hefei Univ Technol, Key Lab Knowledge Engn Big Data, Minist Educ, Hefei 230601, Peoples R China
[3] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Causal feature selection; sample selection bias; causal effect; EFFICIENT;
DOI
10.1145/3604809
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Almost all existing causal feature selection methods are proposed without considering the problem of sample selection bias. In practice, however, because the data-gathering process cannot be fully controlled, sample selection bias often occurs, leading to spurious correlations between features and the class variable, which seriously degrade the performance of those existing methods. In this article, we study the problem of causal feature selection under sample selection bias and propose a novel Progressive Causal Feature Selection (PCFS) algorithm, which has three phases. First, PCFS learns sample weights that balance the treated-group and control-group distributions corresponding to each feature, in order to remove spurious correlations. Second, based on the sample weights, PCFS uses a weighted cross-entropy model to estimate the causal effect of each feature and removes some irrelevant features from the confounder set. Third, PCFS progressively repeats the first two phases to remove more irrelevant features and finally obtains a causal feature set. Experiments on synthetic and real-world datasets validate the effectiveness of PCFS in comparison with several state-of-the-art classical and causal feature selection methods.
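The three phases described in the abstract can be illustrated with a rough sketch. This is not the authors' PCFS implementation: the record gives no algorithmic details, so the code below swaps in inverse-propensity weights as a stand-in for the learned balancing weights, uses the magnitude of a weighted logistic-regression coefficient as a crude proxy for the causal effect, and assumes binary features and a binary class variable. All function names, the `threshold` value, and the stopping rule are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def fit_weighted_logreg(X, y, w, lr=0.5, steps=300, l2=1e-3):
    """Minimise a sample-weighted cross-entropy loss by gradient descent."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # last column is the intercept
    beta = np.zeros(Xb.shape[1])
    w = w / w.sum()  # normalise so the step size does not depend on n
    for _ in range(steps):
        grad = Xb.T @ (w * (sigmoid(Xb @ beta) - y)) + l2 * beta
        beta -= lr * grad
    return beta

def balancing_weights(X, j):
    """Phase 1 stand-in (assumption): inverse-propensity weights that treat
    feature j as the treatment and the remaining features as confounders."""
    t = X[:, j]
    others = np.delete(X, j, axis=1)
    if others.shape[1] == 0:
        return np.ones(len(t))
    beta = fit_weighted_logreg(others, t, np.ones(len(t)))
    p = sigmoid(np.hstack([others, np.ones((len(t), 1))]) @ beta)
    p = np.clip(p, 0.05, 0.95)  # avoid extreme weights
    return t / p + (1 - t) / (1 - p)

def progressive_selection(X, y, threshold=0.3, max_rounds=5):
    """Phases 2 and 3 (sketch): score each feature with a weighted
    cross-entropy model, drop weak ones, and repeat on the survivors."""
    kept = list(range(X.shape[1]))
    for _ in range(max_rounds):
        Xk = X[:, kept]
        effects = []
        for pos in range(len(kept)):
            w = balancing_weights(Xk, pos)
            beta = fit_weighted_logreg(Xk, y, w)
            effects.append(abs(beta[pos]))  # coefficient size as effect proxy
        survivors = [f for f, e in zip(kept, effects) if e >= threshold]
        if len(survivors) == len(kept) or not survivors:
            break  # nothing more to remove, or everything would be removed
        kept = survivors
    return kept
```

Calling `progressive_selection(X, y)` on a float array `X` of 0/1 features and a 0/1 label vector `y` returns the indices of the retained columns. Recomputing the weights in every round mirrors the "progressive" idea in the abstract: once irrelevant features leave the confounder set, the balancing step works with a smaller set of covariates.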
Pages: 18