A Review of Local Outlier Factor Algorithms for Outlier Detection in Big Data Streams

Cited by: 107
Authors
Alghushairy, Omar [1, 2]
Alsini, Raed [1, 3]
Soule, Terence [1 ]
Ma, Xiaogang [1 ]
Affiliations
[1] Univ Idaho, Dept Comp Sci, Moscow, ID 83844 USA
[2] Univ Jeddah, Coll Comp Sci & Engn, Jeddah 23890, Saudi Arabia
[3] King Abdulaziz Univ, Fac Comp & Informat Technol, Jeddah 21589, Saudi Arabia
Funding
U.S. National Science Foundation
Keywords
outlier detection; data science; local outlier factor; genetic algorithm; stream data mining; novelty detection; efficient; classification
DOI
10.3390/bdcc5010001
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Outlier detection is a statistical procedure that aims to find suspicious events or items that are different from the normal form of a dataset. It has drawn considerable interest in the field of data mining and machine learning. Outlier detection is important in many applications, including fraud detection in credit card transactions and network intrusion detection. There are two general types of outlier detection: global and local. Global outliers fall outside the normal range for an entire dataset, whereas local outliers may fall within the normal range for the entire dataset, but outside the normal range for the surrounding data points. This paper addresses local outlier detection. The best-known technique for local outlier detection is the Local Outlier Factor (LOF), a density-based technique. There are many LOF algorithms for a static data environment; however, these algorithms cannot be applied directly to data streams, which are an important type of big data. In general, local outlier detection algorithms for data streams are still deficient and better algorithms need to be developed that can effectively analyze the high velocity of data streams to detect local outliers. This paper presents a literature review of local outlier detection algorithms in static and stream environments, with an emphasis on LOF algorithms. It collects and categorizes existing local outlier detection algorithms and analyzes their characteristics. Furthermore, the paper discusses the advantages and limitations of those algorithms and proposes several promising directions for developing improved local outlier detection methods for data streams.
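The distinction the abstract draws between global and local outliers can be illustrated with a minimal sketch of the standard LOF computation (k-distance, reachability distance, local reachability density). This is not code from the reviewed paper: the function name `lof_scores`, the toy dataset, and the choice of k = 3 are assumptions made purely for illustration, and the sketch assumes no point has k or more exact duplicates.

```python
import math

def lof_scores(points, k):
    """Return the Local Outlier Factor of every point (hypothetical helper).

    Uses the standard static LOF definition; brute-force O(n^2) distances,
    so it is only suitable for small, in-memory datasets.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    k_distance = []   # distance from each point to its k-th nearest neighbour
    neighbors = []    # indices of all points within that k-distance
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]
        k_distance.append(kd)
        neighbors.append([j for j in order if dist[i][j] <= kd])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]

# Toy data (assumed): a dense cluster, a sparse cluster, and one extra point.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]
sparse = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0), (5.5, 5.5)]
candidate = (0.6, 0.6)  # close to the dense cluster, but not part of it
points = dense + sparse + [candidate]

scores = lof_scores(points, k=3)
most_outlying = max(range(len(points)), key=lambda i: scores[i])
print(points[most_outlying], round(scores[most_outlying], 2))
```

The candidate point sits about as far from its nearest neighbour as the sparse-cluster points sit from theirs, so a single global distance threshold would not separate it from them. Its LOF is nonetheless well above 1 because its neighbours in the dense cluster are far more tightly packed than it is, while the members of both clusters score close to 1. The sketch recomputes everything from scratch, which is the cost that makes the static formulation hard to apply directly to data streams.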
Pages: 1 / 24
Page count: 24
Related papers
50 in total
  • [1] A Survey of Outlier Detection Algorithms for Data Streams
    Tamboli, Jinita
    Shukla, Madhu
    [J]. PROCEEDINGS OF THE 10TH INDIACOM - 2016 3RD INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT, 2016, : 3535 - 3540
  • [2] Incremental local outlier detection for data streams
    Pokrajac, Dragoljub
    Lazarevic, Aleksandar
    Latecki, Longin Jan
    [J]. 2007 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DATA MINING, VOLS 1 AND 2, 2007, : 504 - 515
  • [3] Distributed Local Outlier Detection in Big Data
    Yan, Yizhou
    Cao, Lei
    Kuhlman, Caitlin
    Rundensteiner, Elke
    [J]. KDD'17: PROCEEDINGS OF THE 23RD ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2017, : 1225 - 1234
  • [4] Analysis and Evaluation of Outlier Detection Algorithms in Data Streams
    Shukla, Madhu
    Kosta, Y. P.
    Chauhan, Prashant
    [J]. 2015 INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATION AND CONTROL (IC4), 2015,
  • [5] A Fast and Efficient Local Outlier Detection in Data Streams
    Yang, Xing
    Zhou, Wenli
    Shu, Nanfei
    Zhang, Hao
    [J]. PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON IMAGE, VIDEO AND SIGNAL PROCESSING (IVSP 2019), 2019, : 111 - 116
  • [6] Interactive Outlier Exploration in Big Data Streams
    Cao, Lei
    Wang, Qingyang
    Rundensteiner, Elke A.
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2014, 7 (13): : 1621 - 1624
  • [7] Fast Memory Efficient Local Outlier Detection in Data Streams
    Salehi, Mahsa
    Leckie, Christopher
    Bezdek, James C.
    Vaithianathan, Tharshan
    Zhang, Xuyun
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (12) : 3246 - 3260
  • [8] Online Outlier Detection for Data Streams
    Sadik, Shiblee
    Gruenwald, Le
    [J]. PROCEEDINGS OF THE 15TH INTERNATIONAL DATABASE ENGINEERING & APPLICATIONS SYMPOSIUM (IDEAS '11), 2011, : 88 - 96
  • [9] Outlier Detection on Uncertain Data Streams
    Zhu, Bin
    Zhong, Yuling
    Wang, Xite
    Bai, Mei
    [J]. Hunan Daxue Xuebao/Journal of Hunan University Natural Sciences, 2020, 47 (02): : 134 - 140
  • [10] Robust local outlier detection with statistical parameter for big data
    Lei, Jingsheng
    Jiang, Teng
    Wu, Kui
    Du, Haizhou
    Zhu, Lin
    [J]. COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2015, 30 (05): : 411 - 419