A Novel Density-Based Clustering Approach for Outlier Detection in High-Dimensional Data

Cited by: 2
Authors
Messaoud, Thouraya Aouled [1]
Smiti, Abir [2]
Louati, Aymen [1]
Affiliations
[1] Univ Jendouba, Inst Super Informat Kef, Jendouba, Tunisia
[2] Inst Super Gest Tunis, LARODEC, Tunis, Tunisia
Keywords
Outliers; Feature selection; Clustering; DBSCAN
DOI
10.1007/978-3-030-29859-3_28
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Outlier detection, also known as outlier mining, is a key task in data mining and machine learning applications. It matters in medical data because outliers may carry valuable information; however, outlier detection can perform very poorly on high-dimensional data. In this paper, a new outlier detection technique, named Infinite Feature Selection DBSCAN, is proposed; it relies on a feature selection strategy to avoid the curse of dimensionality. The main purpose of the proposed method is to reduce the dimensionality of a high-dimensional data set so that outliers can be identified efficiently using clustering techniques. Simulations on real databases demonstrated the effectiveness of the method in terms of accuracy, error rate, F-score, and the retrieval time of the algorithm.
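The two-stage idea described in the abstract (select a subset of features first, then run DBSCAN and treat its noise points as outliers) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so the feature ranking below uses plain column variance as a stand-in for Infinite Feature Selection, and the function names, the toy data, and the `eps` / `min_pts` values are all assumptions chosen for illustration.

```python
import math
from statistics import pvariance

NOISE = -1  # label given to points that belong to no cluster


def select_features(rows, k):
    """Keep the k highest-variance columns.

    Assumption: variance ranking is a simple stand-in here,
    NOT the Infinite Feature Selection method named in the abstract.
    """
    n_cols = len(rows[0])
    variances = [pvariance([r[j] for r in rows]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[r[j] for j in keep] for r in rows]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, NOISE for outliers."""
    n = len(points)

    def neighbors(i):
        # A point counts as its own neighbor, as in the usual definition.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nb)
    return labels


# Toy data (hypothetical): two tight groups, one isolated point,
# and two nearly constant columns that carry no structure.
base = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.05, 0.05),
        (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1), (5.05, 5.05),
        (2.5, 9.0)]
rows = [[x, y, 1.0 + 0.01 * (i % 2), 0.5] for i, (x, y) in enumerate(base)]

keep, reduced = select_features(rows, k=2)
labels = dbscan(reduced, eps=0.5, min_pts=3)
outliers = [i for i, lab in enumerate(labels) if lab == NOISE]
print("kept columns:", keep)
print("outlier indices:", outliers)
```

On this toy data the two informative columns are retained and only the isolated point is labeled as noise. Results on real data depend heavily on the choice of `eps` and `min_pts` and on the ranking criterion used in the selection stage.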
Pages: 322-331
Page count: 10
Related papers
50 in total
  • [1] Accelerating Density-Based Subspace Clustering in High-Dimensional Data
    Prinzbach, Juergen
    Lauer, Tobias
    Kiefer, Nicolas
    [J]. 21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS ICDMW 2021, 2021, : 474 - 481
  • [2] A density-based clustering algorithm for high-dimensional data with feature selection
    Qi Xianting
    Wang Pan
    [J]. 2016 2ND INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS - COMPUTING TECHNOLOGY, INTELLIGENT TECHNOLOGY, INDUSTRIAL INFORMATION INTEGRATION (ICIICII), 2016, : 114 - 118
  • [3] OUTLIER DETECTION BASED ON DENSITY OF HYPERCUBE IN HIGH-DIMENSIONAL DATA STREAM
    Shou, Zhaoyu
    Zou, Fengbo
    Li, Simin
    Lu, Xianying
    [J]. INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2019, 15 (03): : 873 - 889
  • [4] Unifying Density-Based Clustering and Outlier Detection
    Tao, Yunxin
    Pi, Dechang
    [J]. WKDD: 2009 SECOND INTERNATIONAL WORKSHOP ON KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2009, : 644 - 647
  • [5] KNN-kernel density-based clustering for high-dimensional multivariate data
    Tran, Thanh N.
    Wehrens, Ron
    Buydens, Lutgarde M. C.
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2006, 51 (02) : 513 - 525
  • [6] Robust Local Triangular Kernel Density-based Clustering for High-dimensional Data
    Musdholifah, Aina
    Hashim, Siti Zaiton Mohd
    [J]. 2013 5TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY (CSIT), 2013, : 24 - 32
  • [7] A Novel Density-based Technique for Outlier Detection of High Dimensional Data Utilizing Full Feature Space
    Rehman, Mujeeb Ur
    Khan, Dost Muhammad
    Saher, Najia
    Shahzad, Faisal
    [J]. INFORMATION TECHNOLOGY AND CONTROL, 2021, 50 (01): : 138 - 152
  • [8] A Novel Density-Based Outlier Detection Approach for Low Density Datasets
    Guan, Donghai
    Chen, Kai
    Yuan, Weiwei
    Han, Guangjie
    [J]. JOURNAL OF INTERNET TECHNOLOGY, 2017, 18 (07): : 1639 - 1648
  • [9] A Fast Randomized Method for Local Density-Based Outlier Detection in High Dimensional Data
    Minh Quoc Nguyen
    Omiecinski, Edward
    Mark, Leo
    Irani, Danesh
    [J]. DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, 2010, 6263 : 215 - 226
  • [10] ChronoClust: Density-based clustering and cluster tracking in high-dimensional time-series data
    Putri, Givanna H.
    Read, Mark N.
    Koprinska, Irena
    Singh, Deeksha
    Rohm, Uwe
    Ashhurst, Thomas M.
    King, Nicholas J. C.
    [J]. KNOWLEDGE-BASED SYSTEMS, 2019, 174 : 9 - 26