A GPU Algorithm for Detecting Contextual Outliers in Multiple Concurrent Data Streams

Cited by: 5
Authors
Borah, Abinash [1]
Gruenwald, Le [1]
Leal, Eleazar [2]
Panjei, Egawati [1]
Affiliations
[1] Univ Oklahoma, Sch Comp Sci, Norman, OK 73019 USA
[2] Univ Minnesota, Dept Comp Sci, Duluth, MN 55812 USA
Source
2021 IEEE International Conference on Big Data (Big Data), 2021
Funding
National Science Foundation (USA)
Keywords
Data Stream; Outlier Detection; Contextual Outlier; GPU;
DOI
10.1109/BigData52589.2021.9671460
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A data stream is an infinite sequence of data points generated continuously from a source at a fast rate. It is characterized by the transiency of the data points, the temporal relationship among the data points, concept drift, and the multi-dimensionality of the data points. Outlier detection in data streams thus needs to deal with the characteristics of Big Data applications such as volume, velocity, and variety. Detecting outliers in multiple concurrent data streams introduces additional challenges. In this paper, we propose a parallel outlier detection technique, CODS, to detect Contextual Outliers in multiple concurrent independent multi-dimensional Data Streams using a Graphics Processing Unit (GPU). The proposed algorithm addresses all the aforesaid characteristics of data streams. A set of experiments demonstrates reasonable outlier detection accuracy and scalability of CODS with the number of data streams.
Pages: 2737-2742
Page count: 6