An Adaptive Clustering Approach for Distributed Outlier Detection in Data Streams

Authors
Della Monaca, Andrea [1]
Cafaro, Massimo [1, 2]
Pulimeno, Marco [1]
Epicoco, Italo [1, 2]
Affiliations
[1] Univ Salento, Lecce, Italy
[2] Euro Mediterranean Ctr Climate Change Fdn, Lecce, Italy
Keywords
Outlier detection; Gossip protocol; Principal component analysis;
DOI
10.1007/978-3-031-20859-1_10
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many real-world problems involve collections of high-dimensional data, i.e., data with many different features. A dataset with a large number of features suffers from the so-called curse of dimensionality: as the dimensionality increases, the volume of the space grows rapidly, making the data sparse. This makes clustering high-dimensional data for outlier detection challenging. In this paper, we design and implement a distributed peer-to-peer version of an algorithm that addresses the curse of dimensionality by generating candidate subspaces from the high-dimensional space through Principal Component Analysis. The experimental results show that, if the parameters of the distributed algorithm are properly set, the distributed algorithm converges to the results provided by the sequential algorithm, which is a fundamental and highly desirable property.
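
The abstract combines two ingredients: subspaces obtained through Principal Component Analysis and, per the keywords, a gossip protocol. The Python sketch below is not the authors' algorithm. It only illustrates, under stated assumptions, how peers exchanging local statistics can arrive at the same principal subspace that a centralized computation would produce. The assumptions are: every peer holds the same number of points, peers are paired uniformly at random in each round, and each pair replaces both states with their average. The names `gossip_average` and `top_k_subspace`, the synthetic data, and all parameter values are hypothetical.

```python
import numpy as np


def gossip_average(states, rounds, rng):
    """Randomized pairwise gossip: in every round peers are paired at random
    and both members of a pair adopt the average of their two states."""
    states = [s.copy() for s in states]
    n = len(states)
    for _ in range(rounds):
        perm = rng.permutation(n)
        # Consecutive entries of the permutation form the pairs of this round.
        for i, j in zip(perm[0::2], perm[1::2]):
            avg = (states[i] + states[j]) / 2.0
            states[i] = avg
            states[j] = avg.copy()
    return states


def top_k_subspace(mean, second_moment, k):
    """Orthonormal basis of the top-k principal subspace, computed from the
    mean vector and the (uncentered) second-moment matrix."""
    cov = second_moment - np.outer(mean, mean)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    return eigvecs[:, order]


rng = np.random.default_rng(0)
d, peers, per_peer, k = 5, 8, 200, 2

# Synthetic data (assumption): two high-variance directions, three low ones.
scales = np.array([5.0, 3.0, 1.0, 0.5, 0.2])
data = rng.normal(size=(peers * per_peer, d)) * scales
chunks = np.split(data, peers)                  # equal-sized local datasets

# Each peer packs its local mean and second-moment matrix into one vector.
local = [np.concatenate([c.mean(axis=0), (c.T @ c / len(c)).ravel()])
         for c in chunks]
final = gossip_average(local, rounds=60, rng=rng)

# With equal-sized chunks, the average of the local states is the global state.
global_state = np.mean(local, axis=0)
max_err = max(np.abs(s - global_state).max() for s in final)

# Any peer can now derive the subspace locally; compare with the centralized one.
s0 = final[0]
peer_basis = top_k_subspace(s0[:d], s0[d:].reshape(d, d), k)
central_basis = top_k_subspace(data.mean(axis=0), data.T @ data / len(data), k)

print("largest deviation from the global statistics:", max_err)
```

Pairwise averaging preserves the sum of all states, so every peer can only converge to the global average. Eigenvectors are defined up to sign, so the two bases are best compared through their projection matrices (`basis @ basis.T`) rather than entry by entry. With unequal local dataset sizes, the plain average would have to be replaced by a weighted one.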
Pages: 86-99
Number of pages: 14