Subspace rotations for high-dimensional outlier detection

Cited by: 5
Authors
Chung, Hee Cheol [1]
Ahn, Jeongyoun [2]
Affiliations
[1] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
[2] Univ Georgia, Dept Stat, Athens, GA 30602 USA
Keywords
Group invariance; High dimension and low sample size data; Left-spherical family; Orthogonal group; Randomization test; Stiefel manifold
DOI
10.1016/j.jmva.2020.104713
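Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract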
We propose a new two-stage procedure for detecting multiple outliers when the dimension of the data is much larger than the available sample size. In the first stage, the data are split into two disjoint sets: one containing non-outliers, and the other containing the remaining observations, which are treated as potential outliers. In the second stage, a series of hypothesis tests is conducted to assess the abnormality of each potential outlier. A nonparametric test based on uniform random rotations is adopted for the hypothesis testing. The power of the proposed test is studied under a high-dimensional asymptotic framework, and its finite-sample exactness is established under mild conditions. Numerical studies based on simulated examples and face recognition data suggest that the proposed approach is superior to existing methods, especially in terms of false identification of non-outliers. © 2020 Elsevier Inc. All rights reserved.
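The abstract's second stage relies on a randomization test built from uniform random rotations. The following is a minimal Python sketch of that general idea, not the authors' procedure: it draws Haar-distributed orthogonal matrices through a sign-corrected QR decomposition of a Gaussian matrix and compares an observed statistic with its values on rotated copies of the data. The function names and the statistic `norm_share` are hypothetical choices made for illustration, and the example assumes a null model whose distribution is unchanged when the rows of the data matrix are rotated.

```python
import numpy as np


def haar_orthogonal(n, rng):
    """Draw an n x n orthogonal matrix from the Haar (uniform) distribution."""
    z = rng.standard_normal((n, n))
    q, r = np.linalg.qr(z)
    # Fix the column signs so the QR factor is unique; this makes q Haar-distributed.
    return q * np.sign(np.diag(r))


def norm_share(x, row):
    """Hypothetical statistic: share of the total squared norm carried by one row."""
    return np.sum(x[row] ** 2) / np.sum(x ** 2)


def rotation_pvalue(x, row, n_rot=199, seed=0):
    """Monte Carlo p-value from uniform random rotations of the rows of x."""
    rng = np.random.default_rng(seed)
    observed = norm_share(x, row)
    exceed = 0
    for _ in range(n_rot):
        q = haar_orthogonal(x.shape[0], rng)
        if norm_share(q @ x, row) >= observed:
            exceed += 1
    # Adding one to numerator and denominator keeps the p-value valid for finite n_rot.
    return (exceed + 1) / (n_rot + 1)


rng = np.random.default_rng(1)
data = rng.standard_normal((10, 200))  # n = 10 observations, p = 200 variables
data[-1] *= 10.0                       # inflate the last row to mimic an outlier
print(rotation_pvalue(data, row=9))
```

In this toy setting the inflated row should receive a small p-value, since few random rotations concentrate as much of the total squared norm in a single row. The statistic used in the paper itself is not given in the abstract, so `norm_share` should be read only as a stand-in.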
Pages: 18