Longitudinal Nonresponse Prediction with Time Series Machine Learning

Authors
Collins, John [1]
Kern, Christoph [2]
Affiliations
[1] Univ Mannheim, Mannheimer Zent Europa Sozialforschung, Mannheim, Germany
[2] Ludwig Maximilian Univ Munich, Social Data Sci & AI Lab, Munich, Germany
Keywords
Catch22; Machine learning; Nonresponse; Panel attrition; Recurrent neural network; Time series classification; Household; Bias
DOI
10.1093/jssam/smae037
Chinese Library Classification (CLC) number
O1 [Mathematics]; C [Social Sciences, General]
Subject classification code
03; 0303; 0701; 070101
Abstract
Panel surveys are an important tool for social science researchers, but nonresponse in any panel wave can significantly reduce data quality. Panel managers therefore attempt to identify participants at risk of not participating, using predictive models to target interventions before data collection through adaptive designs. Previous research has shown that these predictions can be improved by accounting for a sample member's behavior in past waves. These past behaviors are often operationalized through rolling average variables that aggregate information over the past two, three, or all waves, such as each participant's nonresponse rate. However, it is possible that this approach is too simple. In this paper, we evaluate models that account for more nuanced temporal dependency, namely recurrent neural networks (RNNs) and feature-, interval-, and kernel-based time series classification techniques. We compare the performance of these novel techniques to that of more traditional logistic regression and tree-based models in predicting future panel survey nonresponse. We apply these algorithms to predict nonresponse in the GESIS Panel, a large-scale, probability-based German longitudinal study, for surveys conducted between 2013 and 2021. Our findings show that RNNs perform similarly to tree-based approaches, but the RNNs do not require the analyst to create rolling average variables. More complex feature-, interval-, and kernel-based techniques are not more effective at classifying future respondents and nonrespondents than RNNs or traditional logistic regression or tree-based methods. We find that predicting nonresponse of newly recruited participants is a more difficult task, and basic RNN models and penalized logistic regression performed best in this situation. We conclude that RNNs may classify future response propensity better than traditional logistic regression and tree-based approaches when the association between time-varying characteristics and survey participation is complex, but they did not do so in the current analysis, where a traditional rolling averages approach yielded comparable results.
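As an illustration of the rolling average variables the abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical participant history coded 0 = responded and 1 = did not respond, and the names `rolling_nonresponse_rate`, `build_features`, and the feature keys are invented for this example. For each wave it aggregates the nonresponse rate over the past two, three, and all waves, using only earlier waves as predictors.

```python
from typing import Dict, List, Optional


def rolling_nonresponse_rate(history: List[int], window: Optional[int] = None) -> Optional[float]:
    """Share of nonresponse (coded 1) among the last `window` past waves.

    If `window` is None, all past waves are used. Returns None when
    there is no past wave to aggregate over.
    """
    if window is not None and window < 1:
        raise ValueError("window must be a positive integer or None")
    past = history if window is None else history[-window:]
    if not past:
        return None
    return sum(past) / len(past)


def build_features(responses: List[int]) -> List[Dict[str, object]]:
    """Build one row per wave t >= 1, using only waves before t as predictors."""
    rows = []
    for t in range(1, len(responses)):
        past = responses[:t]  # behavior observed strictly before wave t
        rows.append({
            "wave": t,
            "nr_rate_last2": rolling_nonresponse_rate(past, 2),
            "nr_rate_last3": rolling_nonresponse_rate(past, 3),
            "nr_rate_all": rolling_nonresponse_rate(past),
            "target_nonresponse": responses[t],  # outcome to predict
        })
    return rows


# Hypothetical participant (assumed data): waves 0..5, 1 = nonresponse
example = [0, 0, 1, 0, 1, 1]
features = build_features(example)
for row in features:
    print(row)
```

Rows like these could be passed to a logistic regression or tree-based model. A sequence model such as an RNN would instead take the raw history (`past` in the sketch) as input, which is why it does not need these hand-built aggregates.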
Pages: 32