An efficient algorithm for distributed density-based outlier detection on big data

Cited by: 55
Authors
Bai, Mei [1]
Wang, Xite [1]
Xin, Junchang [1]
Wang, Guoren [1]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Liaoning, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Density-based outlier; Local outlier factor; Distributed algorithm; MINING OUTLIERS; CUES;
DOI
10.1016/j.neucom.2015.05.135
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Outlier detection is a popular topic in data management and multimedia analysis, with applications such as noisy-image detection, credit card fraud detection, and network intrusion detection. The density-based outlier is an important outlier definition: a Local Outlier Factor (LOF) is computed for each tuple in a data set to represent the degree to which that tuple is an outlier. It shows several significant advantages compared with other existing definitions. This paper focuses on the problem of distributed density-based outlier detection for large-scale data. First, we propose a Grid-Based Partition algorithm (GBP) as a data preparation method. GBP first splits the data set into several grids and then allocates these grids to the datanodes in a distributed environment. Second, we propose a Distributed LOF Computing method (DLC) for detecting density-based outliers in parallel, which requires only a small amount of network communication. Finally, the efficiency and effectiveness of the proposed approaches are verified through a series of simulation experiments. (C) 2015 Elsevier B.V. All rights reserved.
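For readers unfamiliar with the LOF named in the abstract, the following is a minimal single-machine sketch of the standard Local Outlier Factor definition. It is not the paper's GBP or DLC method, and it involves no partitioning or distribution. The function names `knn` and `lof_scores`, the parameter `k`, and the sample points are illustrative assumptions. The sketch also simplifies the definition by taking exactly `k` neighbours (distance ties are ignored) and assumes all points are distinct.

```python
import math


def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]


def lof_scores(points, k):
    """Compute a Local Outlier Factor for every point (brute force, O(n^2 log n)).

    Assumptions of this sketch: points are distinct, k < len(points),
    and ties at the k-th distance are broken arbitrarily.
    """
    n = len(points)
    neighbours = [knn(points, i, k) for i in range(n)]

    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbours]

    # Local reachability density: inverse of the mean reachability distance
    # from the point to its neighbours, where
    # reach_dist(p, o) = max(k_dist(o), dist(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio of the neighbours' density to the point's own density.
    lof = []
    for i in range(n):
        ratios = [lrd[j] / lrd[i] for _, j in neighbours[i]]
        lof.append(sum(ratios) / len(ratios))
    return lof


# Hypothetical sample data: four corners of a unit square and one far point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = lof_scores(points, k=2)
for p, s in zip(points, scores):
    print(p, round(s, 2))
```

A score near 1 means a point is about as dense as its neighbours; the isolated point receives a much larger score than the four corners. The brute-force neighbour search here is the part that becomes expensive on large data, which is the cost a partitioned, distributed approach aims to reduce.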
Pages: 19-28
Page count: 10