Bayesian Network-Based Detection And Prediction of Outliers in Subspace

Authors
Zhou, Lihua [1]
Liu, Weiyi [1]
Chen, Hongmei [1]
Wang, Lizhen [1]
Chen, Jilong [1]
Yang, Xiaodong [1]
Affiliations
[1] Yunnan Univ, Sch Informat, Kunming 650091, Peoples R China
Keywords
Data Mining; Bayesian Network; Outlier; Subspace
DOI
10.1109/WCICA.2008.4593313
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Outlier detection in large data sets is an effective way to discover new knowledge and new classes. It has many applications in domains where outliers may indicate illegal or abnormal behavior, such as fraud detection and medical diagnosis. Existing algorithms mainly find outliers with respect to all attributes of the data. This paper proposes the concept of an outlier in a subspace, which assumes that an object is an outlier relative to only some attributes rather than all of them, and discusses the relationship between outliers in a subspace and outliers in the full space. Based on the relationship between outliers and Bayesian networks, a subspace outlier, that is, an event with small probability in the data set, can be found, and predictions can be made for new data by using the inference ability of the Bayesian network. A subspace outlier has a clear meaning that is easy to explain.
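The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a small, hand-specified discrete Bayesian network and treats a subspace outlier as an assignment to a subset of attributes whose marginal probability falls below a threshold. The network structure, the attribute names (`age`, `income`, `claims`), the probabilities, and the threshold of 0.05 are all hypothetical.

```python
from itertools import product

# Hypothetical network: age -> income, age -> claims.
# Every name and number below is made up for illustration.
PARENTS = {"age": (), "income": ("age",), "claims": ("age",)}
VALUES = {
    "age": ("young", "old"),
    "income": ("low", "high"),
    "claims": ("few", "many"),
}
# Conditional probability tables, keyed by the tuple of parent values.
CPT = {
    "age": {(): {"young": 0.6, "old": 0.4}},
    "income": {
        ("young",): {"low": 0.7, "high": 0.3},
        ("old",): {"low": 0.2, "high": 0.8},
    },
    "claims": {
        ("young",): {"few": 0.9, "many": 0.1},
        ("old",): {"few": 0.95, "many": 0.05},
    },
}


def joint_probability(assignment):
    """Probability of a full assignment via the chain-rule factorization."""
    p = 1.0
    for var, parents in PARENTS.items():
        key = tuple(assignment[q] for q in parents)
        p *= CPT[var][key][assignment[var]]
    return p


def subspace_probability(partial):
    """Marginal probability of a partial assignment (a subspace),
    obtained by summing the joint over all unassigned attributes."""
    free = [v for v in PARENTS if v not in partial]
    total = 0.0
    for combo in product(*(VALUES[v] for v in free)):
        full = dict(partial)
        full.update(zip(free, combo))
        total += joint_probability(full)
    return total


def is_subspace_outlier(partial, threshold=0.05):
    """Flag the partial assignment if its marginal probability is small."""
    return subspace_probability(partial) < threshold


# A new record is checked only on the attributes of one subspace.
new_record = {"age": "young", "income": "high", "claims": "many"}
subspace = {k: new_record[k] for k in ("income", "claims")}
flagged = is_subspace_outlier(subspace)
```

In this toy network the combination of high income and many claims is rare even though neither value is rare on its own, which is the kind of attribute-specific explanation the abstract attributes to subspace outliers. Brute-force enumeration is used here only for brevity; it scales exponentially with the number of unassigned attributes.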
Pages: 2479-2485
Number of pages: 7