Longitudinal Nonresponse Prediction with Time Series Machine Learning

Authors
Collins, John [1]
Kern, Christoph [2]
Affiliations
[1] Univ Mannheim, Mannheimer Zent Europa Sozialforschung, Mannheim, Germany
[2] Ludwig Maximilian Univ Munich, Social Data Sci & AI Lab, Munich, Germany
Keywords
Catch22; Machine learning; Nonresponse; Panel attrition; Recurrent neural network; Time series classification; Household; Bias
DOI
10.1093/jssam/smae037
Chinese Library Classification number
O1 [Mathematics]; C [Social Sciences, General]
Discipline classification code
03; 0303; 0701; 070101
Abstract
Panel surveys are an important tool for social science researchers, but nonresponse in any panel wave can significantly reduce data quality. Panel managers therefore use predictive models to identify participants who may be at risk of not participating, so that interventions can be targeted before data collection through adaptive designs. Previous research has shown that these predictions can be improved by accounting for a sample member's behavior in past waves. These past behaviors are often operationalized through rolling average variables that aggregate information over the past two, three, or all waves, such as each participant's nonresponse rate. However, it is possible that this approach is too simple. In this paper, we evaluate models that account for more nuanced temporal dependency, namely recurrent neural networks (RNNs) and feature-, interval-, and kernel-based time series classification techniques. We compare the performance of these novel techniques to that of more traditional logistic regression and tree-based models in predicting future panel survey nonresponse. We apply these algorithms to predict nonresponse in the GESIS Panel, a large-scale, probability-based German longitudinal study, for surveys conducted between 2013 and 2021. Our findings show that RNNs perform similarly to tree-based approaches, but the RNNs do not require the analyst to create rolling average variables. More complex feature-, interval-, and kernel-based techniques are not more effective at classifying future respondents and nonrespondents than RNNs or traditional logistic regression or tree-based methods. We find that predicting nonresponse of newly recruited participants is a more difficult task, and basic RNN models and penalized logistic regression performed best in this situation. We conclude that RNNs may be better at classifying future response propensity than traditional logistic regression and tree-based approaches when the association between time-varying characteristics and survey participation is complex, but they did not do so in the current analysis, where a traditional rolling averages approach yielded comparable results.
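The rolling average variables mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the 0/1 coding (1 = nonresponse), and the example history are assumptions made for illustration only. For each wave, it computes a participant's nonresponse rate over the previous two, three, or all waves, using only information available before that wave.

```python
from typing import List, Optional


def rolling_nonresponse_rate(
    history: List[int], window: Optional[int] = None
) -> List[Optional[float]]:
    """Share of nonresponse (coded 1) among the waves preceding each wave.

    history: hypothetical per-wave indicators, 1 = nonresponse, 0 = response.
    window:  number of past waves to aggregate; None means all past waves.
    Returns one value per wave; None where no past wave exists yet.
    """
    rates: List[Optional[float]] = []
    for t in range(len(history)):
        # Only waves strictly before wave t are used, so the feature
        # never leaks the outcome that is being predicted.
        start = 0 if window is None else max(0, t - window)
        past = history[start:t]
        rates.append(sum(past) / len(past) if past else None)
    return rates


# Hypothetical participant observed over six waves.
example_history = [0, 1, 0, 0, 1, 1]
features = {
    "nr_rate_last2": rolling_nonresponse_rate(example_history, 2),
    "nr_rate_last3": rolling_nonresponse_rate(example_history, 3),
    "nr_rate_all": rolling_nonresponse_rate(example_history),
}
```

Features of this kind would be fed to a logistic regression or tree-based model, whereas a sequence model such as an RNN could consume the raw per-wave history directly.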
Pages: 32