Identification and classification of multiple outliers, high leverage points and influential observations in linear regression

Cited by: 14
Authors
Nurunnabi, A. A. M. [1]
Nasser, M. [2]
Imon, A. H. M. R. [3]
Affiliations
[1] Rajshahi Univ, Dept Stat, SLG, Rajshahi 6205, Bangladesh
[2] Rajshahi Univ, Dept Stat, Rajshahi 6205, Bangladesh
[3] Ball State Univ, Dept Math Sci, Muncie, IN 47306 USA
Keywords
generalized residual; group deletion; influence distance; leverage matrix; LRI plot; Mahalanobis distance; masking; outlier; regression diagnostics; robust regression; UNMASKING; MATRIX;
DOI
10.1080/02664763.2015.1070806
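Chinese Library Classification (CLC) number
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract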
Detection of multiple unusual observations such as outliers, high leverage points and influential observations (IOs) in regression is still a challenging task for statisticians due to the well-known masking and swamping effects. In this paper we introduce a robust influence distance that can identify multiple IOs, and propose a sixfold plotting technique based on the well-known group deletion approach to classify regular observations, outliers, high leverage points and IOs simultaneously in linear regression. Experiments through several well-referred data sets and simulation studies demonstrate that the proposed algorithm performs successfully in the presence of multiple unusual observations and can avoid masking and/or swamping effects.
Pages: 509 - 525
Page count: 17
Related papers
50 in total
  • [1] Detection of Influential Observations in Spatial Regression Model Based on Outliers and Bad Leverage Classification
    Baba, Ali Mohammed
    Midi, Habshah
    Adam, Mohd Bakri
    Rahman, Nur Haizum Abd
    [J]. SYMMETRY-BASEL, 2021, 13 (11):
  • [2] Procedures for the identification of multiple influential observations in linear regression
    Nurunnabi, A. A. M.
    Hadi, Ali S.
    Imon, A. H. M. R.
    [J]. JOURNAL OF APPLIED STATISTICS, 2014, 41 (06) : 1315 - 1331
  • [3] Identification of multiple high leverage points in logistic regression
    Imon, A. H. M. Rahmatullah
    Hadi, Ali S.
    [J]. JOURNAL OF APPLIED STATISTICS, 2013, 40 (12) : 2601 - 2616
  • [4] INFLUENTIAL OBSERVATIONS AND OUTLIERS IN REGRESSION
    DRAPER, NR
    JOHN, JA
    [J]. TECHNOMETRICS, 1981, 23 (01) : 21 - 26
  • [5] Fast Improvised Influential Distance for the Identification of Influential Observations in Multiple Linear Regression
    Midi, Habshah
    Sani, Muhammad
    Ismaeel, Shelan Saied
    Arasan, Jayanthi
    [J]. SAINS MALAYSIANA, 2021, 50 (07) : 2085 - 2094
  • [6] IDENTIFICATION OF OUTLIERS AND INFLUENTIAL DATA POINTS IN REGRESSION-ANALYSIS
    LANGEHEINE, R
    [J]. PSYCHOLOGISCHE BEITRAGE, 1986, 28 (3-4) : 384 - 396
  • [7] Fast improvised diagnostic robust measure for the identification of high leverage points in multiple linear regression
    Midi, Habshah
    Ismaeel, Shelan Saied
    [J]. JOURNAL OF STATISTICS & MANAGEMENT SYSTEMS, 2018, 21 (06) : 1003 - 1019
  • [8] A STEPWISE METHOD FOR THE IDENTIFICATION OF MULTIPLE OUTLIERS AND INFLUENTIAL OBSERVATIONS
    FUNG, WK
    [J]. SOUTH AFRICAN STATISTICAL JOURNAL, 1995, 29 (01) : 51 - 64
  • [9] Identification of Influential Points in a Linear Regression Model
    Grosz, Jan
    [J]. STATISTIKA-STATISTICS AND ECONOMY JOURNAL, 2011, 48 (01) : 71 - 77
  • [10] Detecting Multiple Influential Observations in High Dimensional Linear Regression
    Zhao, Junlong
    Zhang, Ying
    Niu, Lu
    [J]. ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS, ICIC 2015, PT III, 2015, 9227 : 55 - 64