Outliers in multilevel data

Cited by: 87
Authors
Langford, IH [1]
Lewis, T
Affiliations
[1] Univ E Anglia, Ctr Social & Econ Res Global Environm, Sch Environm Sci, Norwich NR4 7TJ, Norfolk, England
[2] Univ London, Inst Educ, London WC1N 1AZ, England
Keywords
cluster analysis; hierarchical data; influential data points; leverage; multilevel modelling; outlier detection; reduction in deviance; studentized residuals
DOI
10.1111/1467-985X.00094
Chinese Library Classification (CLC) number
O1 [Mathematics]; C [Social Sciences, General]
Subject classification codes
03; 0303; 0701; 070101
Abstract
This paper offers the data analyst a range of practical procedures for dealing with outliers in multilevel data. It first develops several techniques for data exploration for outliers and outlier analysis and then applies these to the detailed analysis of outliers in two large-scale multilevel data sets from educational contexts. The techniques include the use of deviance reduction, measures based on residuals, leverage values, hierarchical cluster analysis and a measure called DFITS. Outlier analysis is more complex in a multilevel data set than in, say, a univariate sample or a set of regression data, where the concept of an outlying value is straightforward. In the multilevel situation one has to consider, for example, at what level or levels a particular response is outlying, and in respect of which explanatory variables; furthermore, the treatment of a particular response at one level may affect its status or the status of other units at other levels in the model.
Pages: 121-153
Number of pages: 33