Two-phase clustering process for outliers detection

Cited by: 227
Authors
Jiang, MF [1]
Tseng, SS [1]
Su, CM [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Comp & Informat Sci, Hsinchu 30050, Taiwan
Keywords
outliers; k-means clustering; two-phase clustering; MST
DOI
10.1016/S0167-8655(00)00131-8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, a two-phase clustering algorithm for outlier detection is proposed. In Phase 1, we first modify the traditional k-means algorithm by using the heuristic "if one new input pattern is far enough away from all cluster centers, then assign it as a new cluster center". As a result, the data points in the same cluster are most likely either all outliers or all non-outliers. In Phase 2, we then construct a minimum spanning tree (MST) and remove its longest edge. The small clusters, that is, the tree with fewer nodes, are selected and regarded as outliers. The experimental results show that our process works well. © 2001 Elsevier Science B.V. All rights reserved.
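The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fixed `threshold` parameter, the choice of the first `k_init` points as initial centers, building the MST over cluster centers with Prim's algorithm, and all function names are assumptions made for illustration. The abstract only states the heuristic, not how "far enough" is measured.

```python
import math

def phase1_modified_kmeans(points, k_init, threshold, n_iter=10):
    """k-means variant: a point farther than `threshold` (an assumed,
    fixed parameter) from every center becomes a new cluster center."""
    centers = [tuple(p) for p in points[:k_init]]  # assumed initialisation
    labels = [0] * len(points)
    for _ in range(n_iter):
        for i, p in enumerate(points):
            d = [math.dist(p, c) for c in centers]
            j = min(range(len(centers)), key=d.__getitem__)
            if d[j] > threshold:          # far from all centers: new cluster
                centers.append(tuple(p))
                j = len(centers) - 1
            labels[i] = j
        for j in range(len(centers)):     # usual k-means center update
            members = [points[i] for i, l in enumerate(labels) if l == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    used = sorted(set(labels))            # drop clusters that ended up empty
    remap = {old: new for new, old in enumerate(used)}
    return [centers[j] for j in used], [remap[l] for l in labels]

def mst_edges(centers):
    """Prim's algorithm on the complete graph over the cluster centers."""
    in_tree, edges = {0}, []
    while len(in_tree) < len(centers):
        w, u, v = min((math.dist(centers[u], centers[v]), u, v)
                      for u in in_tree
                      for v in range(len(centers)) if v not in in_tree)
        edges.append((w, u, v))
        in_tree.add(v)
    return edges

def phase2_outlier_clusters(centers):
    """Remove the longest MST edge; return the subtree with fewer nodes."""
    edges = mst_edges(centers)
    if not edges:
        return set()
    edges.remove(max(edges))
    adj = {i: [] for i in range(len(centers))}
    for _, u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]                # component containing center 0
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    other = set(range(len(centers))) - seen
    return seen if len(seen) < len(other) else other

# Toy data (made up for illustration): five nearby points and one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.4, 0.4), (10, 10)]
centers, labels = phase1_modified_kmeans(points, k_init=2, threshold=3.0)
outlier_clusters = phase2_outlier_clusters(centers)
outliers = [i for i, l in enumerate(labels) if l in outlier_clusters]
print(outliers)
```

In this sketch only a single longest edge is removed, matching the wording of the abstract. Whether further edges are cut, and how the threshold is chosen in practice, is not specified in this record.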
Pages: 691-700
Number of pages: 10