A new approach for detecting multivariate outliers

Cited by: 10
Authors
Gao, SG [1]
Li, GY
Wang, DQ
Affiliations
[1] Chinese Acad Sci, Acad Math & Syst Sci, Beijing 100080, Peoples R China
[2] Chinese Acad Sci, Beijing Genom Inst, Beijing 100080, Peoples R China
[3] Victoria Univ Wellington, Sch Math & Comp Sci, Wellington, New Zealand
Funding
National Natural Science Foundation of China;
Keywords
donor; Mahalanobis distance; MED; outlier; robust distance;
DOI
10.1081/STA-200066315
Chinese Library Classification (CLC) number
O21 [Probability and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
This article proposes a new procedure, named Max-Eigen difference (MED), for identifying outliers in multivariate data sets. Theoretical aspects of the procedure are briefly discussed. The proposed procedure is compared with the Mahalanobis distance (MD) and the robust distance (RD) via two examples. The results indicate that MED works better than MD and is comparable with RD. Finally, the procedure is applied in constructing a quadratic discriminant analysis used for splice site prediction in DNA sequences. Results on rice and human genome data sets show that the robustified discriminant provides higher prediction accuracy than the usual discrimination method.
Pages: 1857-1865
Number of pages: 9