Estimating Multilevel Models on Data Streams

Cited by: 0
Authors
L. Ippel
M. C. Kaptein
J. K. Vermunt
Affiliations
[1] Maastricht University, Institute of Data Science
[2] Tilburg University
Source
Psychometrika | 2019 / Vol. 84
Keywords
Data streams; expectation maximization algorithm; multilevel models; machine (online) learning; SEMA; nested data
DOI
Not available
Abstract
Social scientists are often faced with data that have a nested structure: pupils are nested within schools, employees are nested within companies, or repeated measurements are nested within individuals. Nested data are typically analyzed using multilevel models. However, when data sets are extremely large or when new data continuously augment the data set, estimating multilevel models can be challenging: the current algorithms used to fit multilevel models repeatedly revisit all data points and end up consuming much time and computer memory. This is especially troublesome when predictions are needed in real time and observations keep streaming in. We address this problem by introducing the Streaming Expectation Maximization Approximation (SEMA) algorithm for fitting multilevel models online (or "row-by-row"). In an extensive simulation study, we demonstrate the performance of SEMA compared to traditional methods of fitting multilevel models. Next, SEMA is used to analyze an empirical data stream. The accuracy of SEMA is competitive with that of current state-of-the-art methods, while SEMA is orders of magnitude faster.
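To make the "row-by-row" idea in the abstract concrete, the following is a minimal Python sketch of online updating for nested data. It is not the SEMA algorithm and is not taken from the paper: it only keeps per-group running summaries (count, mean, sum of squared deviations) that are updated once per incoming observation, so no earlier data point is revisited. The class name `StreamingGroupMeans`, the example stream, and the school labels are all hypothetical.

```python
class StreamingGroupMeans:
    """Row-by-row running summaries for nested data (illustrative only, not SEMA)."""

    def __init__(self):
        # group id -> [count, running mean, sum of squared deviations]
        self.groups = {}
        self.n = 0
        self.grand_mean = 0.0

    def update(self, group, y):
        """Consume one observation; the raw value does not need to be stored."""
        stats = self.groups.setdefault(group, [0, 0.0, 0.0])
        stats[0] += 1
        delta = y - stats[1]
        stats[1] += delta / stats[0]
        stats[2] += delta * (y - stats[1])  # Welford's online update
        self.n += 1
        self.grand_mean += (y - self.grand_mean) / self.n

    def group_mean(self, group):
        """Current mean of one group (raises KeyError for an unseen group)."""
        return self.groups[group][1]

    def pooled_within_variance(self):
        """Pooled within-group variance computed from the stored summaries."""
        ss = sum(s[2] for s in self.groups.values())
        df = sum(s[0] - 1 for s in self.groups.values())
        return ss / df if df > 0 else float("nan")


# Hypothetical stream of (school, test score) rows arriving one at a time.
stream = [("school_a", 4.0), ("school_b", 10.0), ("school_a", 6.0), ("school_b", 14.0)]
model = StreamingGroupMeans()
for school, score in stream:
    model.update(school, score)
```

Memory use in this sketch grows with the number of groups rather than the number of observations, which is the property that makes row-by-row estimation attractive when observations keep streaming in.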
Pages: 41-64
Page count: 23
Related papers
50 in total
  • [1] Estimating Multilevel Models on Data Streams
    Ippel, L.
    Kaptein, M. C.
    Vermunt, J. K.
    PSYCHOMETRIKA, 2019, 84 (01) : 41 - 64
  • [2] Estimating random-intercept models on data streams
    Ippel, L.
    Kaptein, M. C.
    Vermunt, J. K.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2016, 104 : 169 - 182
  • [3] Estimating missing data in data streams
    Jiang, Nan
    Gruenwald, Le
    ADVANCES IN DATABASES: CONCEPTS, SYSTEMS AND APPLICATIONS, 2007, 4443 : 981 - +
  • [4] Fast meta-analytic approximations for relational event models: applications to data streams and multilevel data
    Vieira, Fabio
    Leenders, Roger
    Mulder, Joris
    JOURNAL OF COMPUTATIONAL SOCIAL SCIENCE, 2024, 7 (02) : 1823 - 1859
  • [5] Estimating clustering indexes in data streams
    Buriol, Luciana S.
    Frahling, Gereon
    Leonardi, Stefano
    Sohler, Christian
    ALGORITHMS - ESA 2007, PROCEEDINGS, 2007, 4698 : 618 - +
  • [6] Estimating Mutual Information on Data Streams
    Keller, Fabian
    Mueller, Emmanuel
    Boehm, Klemens
    PROCEEDINGS OF THE 27TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, 2015,
  • [7] On estimating frequency moments of data streams
    Ganguly, Sumit
    Cormode, Graham
    APPROXIMATION, RANDOMIZATION, AND COMBINATORIAL OPTIMIZATION: ALGORITHMS AND TECHNIQUES, 2007, 4627 : 479 - +
  • [8] Estimating multilevel linear models as structural equation models
    Bauer, DJ
    JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS, 2003, 28 (02) : 135 - 167
  • [9] Estimating entropy over data streams
    Bhuvanagiri, Lakshminath
    Ganguly, Sumit
    ALGORITHMS - ESA 2006, PROCEEDINGS, 2006, 4168 : 148 - 159