WMFP-Outlier: An Efficient Maximal Frequent-Pattern-Based Outlier Detection Approach for Weighted Data Streams

Cited by: 10
Authors
Cai, Saihua [1]
Li, Qian [1]
Li, Sicong [1]
Yuan, Gang [1]
Sun, Ruizhi [1, 2]
Affiliations
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
[2] Minist Agr, Sci Res Base Integrated Technol Precis Agr Anim H, Beijing 100083, Peoples R China
Source
INFORMATION TECHNOLOGY AND CONTROL | 2019 / Vol. 48 / No. 4
Keywords
outlier detection; weighted maximal frequent-pattern mining; weighted data stream; deviation indices; data mining
DOI
10.5755/j01.itc.48.4.22176
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Because outliers are a major factor affecting accuracy in data science, many outlier detection approaches have been proposed to identify implicit outliers in static datasets and thereby improve data reliability. In recent years, data streams have become the main form of data, and the elements of a data stream are not always of equal importance. However, existing outlier detection approaches do not take weights into account and are therefore unsuitable for weighted data streams. In addition, traditional pattern-based outlier detection approaches incur a high time cost in the outlier detection phase. To overcome these problems, this paper proposes WMFP-Outlier, a two-phase pattern-based approach for detecting implicit outliers in a weighted data stream, which uses maximal frequent patterns instead of all frequent patterns to accelerate outlier detection. In the maximal frequent-pattern mining phase, the anti-monotonicity property and an MFP-array structure are used to accelerate mining. In the outlier detection phase, three deviation indices are designed to measure the degree of abnormality of each transaction, and the transactions with the highest degrees of abnormality are judged to be outliers. Finally, experiments on a synthetic dataset evaluate the performance of WMFP-Outlier. The results demonstrate that WMFP-Outlier is more accurate than the existing pattern-based outlier detection approaches and that the time cost of its outlier detection phase is lower than that of the four other pattern-based approaches compared.
Pages: 505-521
Number of pages: 17
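
The two phases described in the abstract (mine maximal frequent patterns, then score each transaction against them) can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the weighted-support definition (plain support multiplied by the mean item weight), the brute-force enumeration (the paper instead relies on the anti-monotonicity property and an MFP-array structure), the single overlap-based score standing in for the paper's three deviation indices, and all function names, data, and thresholds.

```python
from itertools import combinations


def weighted_support(pattern, transactions, weights):
    """Assumed definition: fraction of transactions containing the pattern,
    multiplied by the mean weight of the pattern's items."""
    count = sum(1 for t in transactions if pattern <= t)
    mean_w = sum(weights[i] for i in pattern) / len(pattern)
    return mean_w * count / len(transactions)


def maximal_frequent_patterns(transactions, weights, min_wsup):
    """Brute-force enumeration of all itemsets; exponential in the number of
    items, so suitable for toy data only."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            p = frozenset(combo)
            ws = weighted_support(p, transactions, weights)
            if ws >= min_wsup:
                frequent[p] = ws
    # A pattern is maximal if no frequent proper superset exists.
    return {p: ws for p, ws in frequent.items()
            if not any(p < q for q in frequent)}


def deviation_scores(transactions, maximal):
    """Hypothetical score in [0, 1]: 1 minus the weighted-support-weighted
    fraction of each maximal pattern that the transaction overlaps."""
    total = sum(maximal.values())
    scores = []
    for t in transactions:
        covered = sum(ws * len(p & t) / len(p) for p, ws in maximal.items())
        scores.append(1.0 - covered / total if total else 0.0)
    return scores


# Toy weighted data: five transactions, made-up item weights.
transactions = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"},
                {"a", "b", "c"}, {"d", "e"}]
weights = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.3, "e": 0.2}

maximal = maximal_frequent_patterns(transactions, weights, min_wsup=0.3)
scores = deviation_scores(transactions, maximal)

# Rank transactions from most to least abnormal.
ranking = sorted(range(len(transactions)), key=lambda i: scores[i], reverse=True)
print(ranking[0], scores)
```

On this toy data the only maximal frequent pattern is {a, b, c}, so the transaction {d, e}, which shares no items with it, receives the highest score. Scoring against maximal patterns alone keeps the number of comparisons per transaction small, which is the motivation the abstract gives for using them in place of all frequent patterns.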