An overview of modern machine learning methods for effect measure modification analyses in high-dimensional settings

Authors
Cheung, Michael [1]
Dimitrova, Anna [1]
Benmarhnia, Tarik [1]
Affiliations
[1] Univ Calif San Diego, Scripps Inst Oceanog, San Diego, CA USA
Keywords
Effect measure modification; Heterogeneity; Machine learning; Generalized random forest; Bayesian additive regression trees; Bayesian causal forest; Metalearner; CAUSAL INFERENCE; CHILD UNDERNUTRITION; ASSOCIATION; REGRESSION; SELECTION
DOI
10.1016/j.ssmph.2025.101764
Chinese Library Classification
R1 [Preventive medicine, hygiene]
Discipline classification codes
1004; 120402
Abstract
A primary concern of public health researchers is identifying and quantifying heterogeneous exposure effects across population subgroups. Understanding the magnitude and direction of these effects on a given scale enables researchers to recommend policies and assess the external validity of findings. Traditional methods for effect measure modification analyses require manual model specification, which is often impractical or infeasible in high-dimensional settings. Recent developments in machine learning aim to address this issue by using data-driven approaches to estimate heterogeneous exposure effects. However, these methods do not directly identify effect modifiers or estimate the corresponding subgroup effects. Consequently, additional analysis techniques are required to use them for effect measure modification analyses. Although no data-driven method or technique can identify effect modifiers on its own, and domain expertise is still required, such methods may play an important role in discovering vulnerable subgroups when prior knowledge is not available. We summarize these machine learning methods, provide the intuition behind them, and discuss how they may be employed for effect measure modification analyses, so that this overview can serve as a reference for public health researchers. We discuss their implementation in R with annotated syntax and demonstrate their application in a case study assessing the heterogeneous effects of drought on stunting among children in the Demographic and Health Survey data set.
Pages: 14