A practical method to account for outliers in simple linear regression using the median of slopes

Authors
Tedeschi, Luis O. [1]
Galyean, Michael L. [2]
Affiliations
[1] Texas A&M Univ, Dept Anim Sci, College Stn, TX 77843 USA
[2] Texas Tech Univ, Dept Vet Sci, Lubbock, TX 79409 USA
Source
SCIENTIA AGRICOLA | 2024 / Vol. 81
Keywords
estimation; methods; relationship; robust; statistics; EFFICIENCY; ENERGY; GAIN
DOI
10.1590/1678-992X-2022-0209
Chinese Library Classification
S [Agricultural Sciences]
Discipline classification code
09
Abstract
Ordinary least squares (OLS) regression can be affected by errors associated with heteroscedasticity and outliers, and extreme points can influence the regression parameters. Methods based on the median rather than on the mean and variance are more resistant to outliers and extreme points. These methods could be used to obtain regression parameter estimates that more accurately reflect the genuine relationship between the Y and X variables, and comparing their slopes and intercepts with those of OLS leads to better identification of outliers and extreme points. The Theil-Sen (TS) regression computes all possible pairwise slopes and takes the median of those slopes as the regression slope. Here, we illustrated the potential use of TS and frequently used robust regression (RR) techniques in simple linear regression using synthetic datasets and a practical problem in animal science. Three synthetic datasets were created assuming normally distributed Y and X values: one was free of outliers, while the other two had one or two clusters of outliers but the same X values. The TS, OLS, and RR methods gave nearly identical regression parameter estimates for the dataset without synthetic outliers. However, the intercept and slope estimates from the OLS method differed considerably from those of the TS and RR methods when one or two clusters of outliers were included. The TS approach could be used to indirectly determine the presence of outliers or extreme points by comparing the 95% confidence intervals of the TS and OLS parameter estimates.
Pages: 8
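
The median-of-slopes idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `theil_sen` and `ols`, the toy data, and the choice of intercept (the median of y - slope * x, one common convention) are all assumptions made here, and the 95% confidence intervals mentioned in the abstract are not computed.

```python
from itertools import combinations
from statistics import mean, median


def theil_sen(x, y):
    """Theil-Sen fit: slope is the median of all pairwise slopes.

    The intercept convention (median of y - slope * x) is an assumption;
    the abstract does not say which one the authors used.
    """
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]  # skip pairs with identical x (undefined slope)
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


def ols(x, y):
    """Ordinary least squares fit for simple linear regression."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical toy data: y = 2x + 1, with the last point replaced by an outlier.
x = list(range(1, 11))
y = [2 * xi + 1 for xi in x]
y[-1] = 100

print("Theil-Sen (slope, intercept):", theil_sen(x, y))
print("OLS       (slope, intercept):", ols(x, y))
```

On this toy data the Theil-Sen estimates stay at the underlying slope of 2 and intercept of 1, while the OLS slope is pulled upward by the single outlier. A large gap between the two fits is the kind of signal the abstract suggests using to flag outliers or extreme points.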