A multivariate extreme value theory approach to anomaly clustering and visualization

Cited by: 0
Authors
Maël Chiapino
Stephan Clémençon
Vincent Feuillard
Anne Sabourin
Affiliations
[1] LTCI
[2] Télécom Paris
[3] Institut polytechnique de Paris
[4] Airbus Central R&T
[5] AI Research
Source
Computational Statistics | 2020 / Vol. 35
Keywords
Anomaly detection; Clustering; Graph-mining; Latent variable analysis; Mixture modelling; Multivariate extreme value theory; Visualization;
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
In a wide variety of situations, anomalies in the behaviour of a complex system, whose health is monitored through the observation of a random vector $\mathbf{X} = (X_1, \ldots, X_d)$ valued in $\mathbb{R}^d$, correspond to the simultaneous occurrence of extreme values for certain subgroups $\alpha \subset \{1, \ldots, d\}$ of variables $X_j$. Under the heavy-tail assumption, which is precisely appropriate for modeling these phenomena, statistical methods relying on multivariate extreme value theory have been developed in the past few years for identifying such events/subgroups.
This paper exploits this approach much further by means of a novel mixture model that permits to describe the distribution of extremal observations and where the anomaly type $\alpha$ is viewed as a latent variable. One may then take advantage of the model by assigning to any extreme point a posterior probability for each anomaly type $\alpha$, defining implicitly a similarity measure between anomalies. It is explained at length how the latter permits to cluster extreme observations and obtain an informative planar representation of anomalies using standard graph-mining tools. The relevance and usefulness of the clustering and 2-d visual display thus designed is illustrated on simulated datasets and on real observations as well, in the aeronautics application domain.
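As a rough illustration of how posterior probabilities over anomaly types can induce a similarity measure and a clustering, here is a minimal Python sketch. It is not the paper's method: the similarity used (the inner product of two posterior vectors, i.e. the probability that two points share the same latent type if the types were drawn independently from their posteriors), the threshold `tau`, the connected-components clustering, and the toy posterior values are all assumptions made for illustration.

```python
def posterior_similarity(post):
    """Pairwise similarity from posterior vectors (assumed definition).

    post[i][k] is the posterior probability that extreme point i has
    anomaly type k. The similarity of i and j is the inner product of
    their posterior vectors.
    """
    return [[sum(a * b for a, b in zip(p, q)) for q in post] for p in post]


def threshold_clusters(sim, tau=0.5):
    """Label connected components of the graph linking i and j
    whenever sim[i][j] >= tau (a simple stand-in for graph mining)."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and sim[i][j] >= tau:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical posteriors for four extreme points and three anomaly types.
post = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
]
sim = posterior_similarity(post)
labels = threshold_clusters(sim, tau=0.5)
print(labels)
```

In this toy input the first two points put most of their mass on the same type, as do the last two, so the thresholded graph splits into two groups. A planar display could then be obtained by feeding the same similarity matrix to a graph layout tool.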
Pages: 607-628
Number of pages: 21