WMFP-Outlier: An Efficient Maximal Frequent-Pattern-Based Outlier Detection Approach for Weighted Data Streams

Cited by: 10
Authors
Cai, Saihua [1]
Li, Qian [1]
Li, Sicong [1]
Yuan, Gang [1]
Sun, Ruizhi [1, 2]
Affiliations
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
[2] Minist Agr, Sci Res Base Integrated Technol Precis Agr Anim H, Beijing 100083, Peoples R China
Source
INFORMATION TECHNOLOGY AND CONTROL | 2019 / Vol. 48 / No. 4
Keywords
outlier detection; weighted maximal frequent-pattern mining; weighted data stream; deviation indices; data mining
DOI
10.5755/j01.itc.48.4.22176
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Because outliers are a major factor affecting accuracy in data science, many outlier detection approaches have been proposed to identify implicit outliers in static datasets and thereby improve data reliability. In recent years, data streams have become the main form of data, and the elements of a data stream are not always of equal importance. However, existing outlier detection approaches do not take weights into account and are therefore unsuitable for processing weighted data streams. In addition, traditional pattern-based outlier detection approaches incur a high time cost in the outlier detection phase. To overcome these problems, this paper proposes a two-phase pattern-based outlier detection approach, WMFP-Outlier, for detecting implicit outliers in a weighted data stream; it uses maximal frequent patterns instead of all frequent patterns to accelerate outlier detection. In the maximal frequent-pattern mining phase, the anti-monotonicity property and an MFP-array structure are used to accelerate mining. In the outlier detection phase, three deviation indices are designed to measure the degree of abnormality of each transaction, and the transactions with the highest degrees of abnormality are judged to be outliers. Finally, several experiments are conducted on a synthetic dataset to evaluate the performance of WMFP-Outlier. The results demonstrate that WMFP-Outlier is more accurate than the existing pattern-based outlier detection approaches, and that the time cost of its outlier detection phase is lower than that of the other four pattern-based approaches it is compared with.
Pages: 505-521
Page count: 17
Related papers
50 in total
  • [41] Minimal weighted infrequent itemset mining-based outlier detection approach on uncertain data stream
    Saihua Cai
    Ruizhi Sun
    Shangbo Hao
    Sicong Li
    Gang Yuan
    Neural Computing and Applications, 2020, 32 : 6619 - 6639
  • [42] Minimal weighted infrequent itemset mining-based outlier detection approach on uncertain data stream
    Cai, Saihua
    Sun, Ruizhi
    Hao, Shangbo
    Li, Sicong
    Yuan, Gang
    NEURAL COMPUTING & APPLICATIONS, 2020, 32 (11): : 6619 - 6639
  • [43] An efficient reference-based approach to outlier detection in large datasets
    Pei, Yaling
    Zaiane, Osmar R.
    Gao, Yong
    ICDM 2006: SIXTH INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2006, : 478 - 487
  • [44] An Efficient Outlier Detection and Classification Clustering-Based Approach for WSN
    Al Samara, Mustafa
    Bennis, Ismail
    Abouaissa, Abdelhafid
    Lorenz, Pascal
    2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2021,
  • [45] An Energy-Efficient Outlier Detection Based on Data Clustering in WSNs
    Kim, Hongyeon
    Min, Jun-Ki
    INTERNATIONAL JOURNAL OF DISTRIBUTED SENSOR NETWORKS, 2014,
  • [46] An Efficient Algorithm for Sliding Window-Based Weighted Frequent Pattern Mining over Data Streams
    Ahmed, Chowdhury Farhan
    Tanbeer, Syed Khairuzzaman
    Jeong, Byeong-Soo
    Lee, Young-Koo
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2009, E92D (07): : 1369 - 1381
  • [47] An Effective Minimal Probing Approach With Micro-Cluster for Distance-Based Outlier Detection in Data Streams
    Bah, Mohamed Jaward
    Wang, Hongzhi
    Hammad, Mohamed
    Zeshan, Furkh
    Aljuaid, Hanan
    IEEE ACCESS, 2019, 7 : 154922 - 154934
  • [48] MiFI-Outlier: Minimal infrequent itemset-based outlier detection approach on uncertain data stream
    Cai, Saihua
    Li, Sicong
    Yuan, Gang
    Hao, Shangbo
    Sun, Ruizhi
    KNOWLEDGE-BASED SYSTEMS, 2020, 191 (191)
  • [49] INCREMENTAL PRINCIPAL COMPONENT ANALYSIS BASED OUTLIER DETECTION METHODS FOR SPATIOTEMPORAL DATA STREAMS
    Bhushan, Alka
    Sharker, Monir H.
    Karimi, Hassan A.
    ISPRS INTERNATIONAL WORKSHOP ON SPATIOTEMPORAL COMPUTING, 2015, : 67 - 71
  • [50] Improved incremental local outlier detection for data streams based on the landmark window model
    Aihua Li
    Weijia Xu
    Zhidong Liu
    Yong Shi
    Knowledge and Information Systems, 2021, 63 : 2129 - 2155