Uncovering Sociological Effect Heterogeneity Using Tree-Based Machine Learning

Cited by: 21
Authors
Brand, Jennie E. [1, 2, 3]
Xu, Jiahui [4 ]
Koch, Bernard
Geraldo, Pablo [1 ]
Affiliations
[1] Univ Calif Los Angeles, Los Angeles, CA USA
[2] Calif Ctr Populat Res, Los Angeles, CA USA
[3] Ctr Social Stat, Los Angeles, CA USA
[4] Penn State Univ, University Pk, PA 16802 USA
Source
Keywords
heterogeneity; causal inference; machine learning; causal trees; decision trees; random forests; propensity score estimation; returns; health; regression; education; outcomes
DOI
10.1177/0081175021993503
Chinese Library Classification (CLC) number
C91 [Sociology]
Discipline classification code
030301; 1204
Abstract
Individuals do not respond uniformly to treatments, such as events or interventions. Sociologists routinely partition samples into subgroups to explore how the effects of treatments vary by selected covariates, such as race and gender, on the basis of theoretical priors. Data-driven discoveries are also routine, yet the analyses by which sociologists typically go about them are often problematic and seldom move us beyond our biases to explore new meaningful subgroups. Emerging machine learning methods based on decision trees allow researchers to explore sources of variation that they may not have previously considered or envisaged. In this article, the authors use tree-based machine learning, that is, causal trees, to recursively partition the sample to uncover sources of effect heterogeneity. Assessing a central topic in social inequality, college effects on wages, the authors compare what is learned from covariate and propensity score-based partitioning approaches with recursive partitioning based on causal trees. Decision trees, although superseded by forests for estimation, can be used to uncover subpopulations responsive to treatments. Using observational data, the authors expand on the existing causal tree literature by applying leaf-specific effect estimation strategies to adjust for observed confounding, including inverse propensity weighting, nearest neighbor matching, and doubly robust causal forests. The authors also assess localized balance metrics and sensitivity analyses to address the possibility of differential imbalance and unobserved confounding. The authors encourage researchers to follow similar data exploration practices in their work on variation in sociological effects and offer a straightforward framework by which to do so.
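As a rough illustration of the leaf-specific inverse propensity weighting mentioned in the abstract, the sketch below computes a normalized (Hajek-style) weighted difference in mean outcomes within each leaf. It is not the authors' code: the function name `leaf_ipw_effects` and its arguments are hypothetical, and it assumes that leaf membership (from some already fitted tree) and propensity scores strictly between 0 and 1 are supplied by the caller.

```python
from collections import defaultdict


def leaf_ipw_effects(y, t, e, leaf):
    """Illustrative per-leaf IPW effect estimates (hypothetical helper).

    y    : outcomes
    t    : treatment indicators (1 = treated, 0 = control)
    e    : propensity scores, assumed to lie strictly between 0 and 1
    leaf : leaf label for each unit, assumed to come from a fitted tree
    """
    # Per leaf: [sum of w1*y, sum of w1, sum of w0*y, sum of w0]
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for yi, ti, ei, gi in zip(y, t, e, leaf):
        s = sums[gi]
        if ti == 1:
            w = 1.0 / ei            # weight for a treated unit
            s[0] += w * yi
            s[1] += w
        else:
            w = 1.0 / (1.0 - ei)    # weight for a control unit
            s[2] += w * yi
            s[3] += w

    effects = {}
    for g, (ty, tw, cy, cw) in sums.items():
        if tw > 0 and cw > 0:
            # Weighted treated mean minus weighted control mean in this leaf
            effects[g] = ty / tw - cy / cw
        else:
            # A leaf with no treated or no control units has no estimate
            effects[g] = float("nan")
    return effects


if __name__ == "__main__":
    # Toy data, invented purely for illustration
    y = [3.0, 1.0, 5.0, 2.0]
    t = [1, 0, 1, 0]
    e = [0.5, 0.5, 0.5, 0.5]
    leaf = [0, 0, 1, 1]
    print(leaf_ipw_effects(y, t, e, leaf))
```

Normalizing the weights within each leaf keeps the estimate on the scale of the outcome even when a leaf contains few units; whether this matches the exact estimator used in the article is not stated in the abstract.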
Pages: 189-223
Page count: 35