Deeply Learned Generalized Linear Models with Missing Data

Cited by: 0
Authors
Lim, David K. [1 ]
Rashid, Naim U. [1 ]
Oliva, Junier B. [2 ]
Ibrahim, Joseph G. [1 ]
Affiliations
[1] Univ North Carolina Chapel Hill, Dept Biostat, Chapel Hill, NC 27515 USA
[2] Univ North Carolina Chapel Hill, Dept Comp Sci, Chapel Hill, NC USA
Keywords
Deeply learned GLM; Missing data; MNAR; Supervised learning; Imputation
DOI
10.1080/10618600.2023.2276122
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Codes
020208; 070103; 0714
Abstract
Deep Learning (DL) methods have dramatically increased in popularity in recent years, with significant growth in their application to various supervised learning problems. However, the greater prevalence and complexity of missing data in such datasets present significant challenges for DL methods. Here, we provide a formal treatment of missing data in the context of deeply learned generalized linear models, a supervised DL architecture for regression and classification problems. We propose a new architecture, dlglm, that is one of the first to be able to flexibly account for both ignorable and non-ignorable patterns of missingness in input features and response at training time. We demonstrate through statistical simulation that our method outperforms existing approaches for supervised learning tasks in the presence of missing not at random (MNAR) missingness. We conclude with a case study of the Bank Marketing dataset from the UCI Machine Learning Repository, in which we predict whether clients subscribed to a product based on phone survey data. Supplementary materials for this article are available online.
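The abstract does not give implementation details of dlglm, and under MNAR the method additionally models the missingness mechanism itself, which the following sketch does not attempt. Purely as a rough illustration of one common way to expose missingness to a supervised neural model (zero-imputed features concatenated with a binary missingness mask), here is a minimal PyTorch example; all names (MaskedGLMNet, hidden_dim) are hypothetical and not from the paper.

```python
# Hypothetical sketch (not the authors' dlglm code): a supervised network whose
# input is the zero-imputed feature vector together with a binary mask, so the
# model can condition on which features were actually observed.
import torch
import torch.nn as nn

class MaskedGLMNet(nn.Module):
    def __init__(self, n_features, hidden_dim=64):
        super().__init__()
        # Input is [imputed features, missingness mask], hence 2 * n_features.
        self.body = nn.Sequential(
            nn.Linear(2 * n_features, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),  # linear predictor of a Bernoulli GLM
        )

    def forward(self, x, mask):
        # x: features with NaNs replaced by 0; mask: 1 where observed, 0 where missing.
        return self.body(torch.cat([x * mask, mask], dim=1)).squeeze(-1)

# Usage: record the observation mask, then replace NaNs by zero.
x_raw = torch.tensor([[0.5, float("nan")], [1.2, -0.3]])
mask = (~torch.isnan(x_raw)).float()
x = torch.nan_to_num(x_raw, nan=0.0)
logits = MaskedGLMNet(n_features=2)(x, mask)  # feed into BCEWithLogitsLoss
```

This mask-concatenation pattern handles ignorable missingness heuristically; the contribution described in the abstract is a principled treatment that also accommodates non-ignorable (MNAR) missingness in both features and response at training time.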
Pages: 638-650
Page count: 13