An overview of modern machine learning methods for effect measure modification analyses in high-dimensional settings

Cited by: 0
Authors
Cheung, Michael [1]
Dimitrova, Anna [1]
Benmarhnia, Tarik [1]
Affiliations
[1] Univ Calif San Diego, Scripps Inst Oceanog, San Diego, CA USA
Keywords
Effect measure modification; Heterogeneity; Machine learning; Generalized random forest; Bayesian additive regression trees; Bayesian causal forest; Metalearner; CAUSAL INFERENCE; CHILD UNDERNUTRITION; ASSOCIATION; REGRESSION; SELECTION
DOI
10.1016/j.ssmph.2025.101764
Chinese Library Classification (CLC) number
R1 [Preventive medicine, hygiene]
Discipline classification code
1004; 120402
Abstract
A primary concern of public health researchers is identifying and quantifying heterogeneous exposure effects across population subgroups. Understanding the magnitude and direction of these effects on a given scale allows researchers to recommend policy prescriptions and assess the external validity of findings. Traditional methods for effect measure modification analyses require manual model specification, which is often impractical or infeasible in high-dimensional settings. Recent developments in machine learning aim to solve this issue by using data-driven approaches to estimate heterogeneous exposure effects. However, these methods do not directly identify effect modifiers or estimate the corresponding subgroup effects. Consequently, additional analysis techniques are required to use these methods for effect measure modification analyses. Although no data-driven method or technique can identify effect modifiers and domain expertise is still required, these methods may play an important role in the discovery of vulnerable subgroups when prior knowledge is not available. We summarize and provide the intuition behind these machine learning methods and discuss how they may be employed for effect measure modification analyses, to serve as a reference for public health researchers. We discuss their implementation in R with annotated syntax and demonstrate their application in a case study that assesses the heterogeneous effects of drought on stunting among children in the Demographic and Health Survey data set.
Pages: 14