A Hybrid Outlier Detection Method for Health Care Big Data

Cited by: 11
Authors
Yan, Ke [1]
You, Xiaoming [2, 3]
Ji, Xiaobo [1]
Yin, Guangqiang [4]
Yang, Fan [1, 5]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China
[2] 32 Inst China Elect Technol Grp Corp, Shanghai, Peoples R China
[3] Tongji Univ, Sch Software Engn, Shanghai, Peoples R China
[4] Univ Elect Sci & Technol China, Sch Elect Engn, Chengdu, Peoples R China
[5] Chengdu Community Univ, Chengdu, Peoples R China
Keywords
K-Nearest Neighbor; pruning; health care; outlier detection; attribute overlapping rate; case classification quality character; big data; ALGORITHM;
DOI
10.1109/BDCloud-SocialCom-SustainCom.2016.34
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Technology advancements in health care informatics, the digitalization of health records, and telemedicine have resulted in rapid growth of health care data. One challenge is how to effectively discover useful and important information in such massive amounts of data through techniques such as data mining. Outlier detection is a typical technique used in many fields to analyze big data. However, for large-scale, high-dimensional health care data, conventional outlier detection methods are not efficient. This paper proposes a novel hybrid outlier detection method, Pruning-based K-Nearest Neighbor (PB-KNN), which integrates density-based and cluster-based methods with the KNN algorithm to conduct effective outlier detection. The proposed PB-KNN adopts the case classification quality character (CCQC) as the medical quality evaluation model and uses the attribute overlapping rate (AOR) algorithm for data classification and dimensionality reduction. To evaluate the performance of the pruning operations in PB-KNN, we conduct extensive experiments. The experimental results show that PB-KNN outperforms k-nearest neighbor (KNN) and local outlier factor (LOF) in terms of accuracy and efficiency.
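The abstract compares PB-KNN against a plain KNN baseline. As a rough illustration of that baseline only, the following Python sketch scores each record by the distance to its k-th nearest neighbor and reports the highest-scoring records as outlier candidates. It is not the authors' PB-KNN: the pruning, CCQC, and AOR steps are not specified in this record and are not modeled here. The function names and the sample data are hypothetical.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by the Euclidean distance to its k-th nearest neighbor.

    Assumption: this is the generic distance-based KNN outlier score,
    not the PB-KNN method described in the abstract.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < len(points)")
    scores = []
    for i, p in enumerate(points):
        # Brute-force distances from p to every other point (O(n^2) overall).
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def top_outliers(points, k, n):
    """Return the indices of the n points with the largest k-th neighbor distance."""
    scores = knn_outlier_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Hypothetical two-attribute records: four clustered points and one distant point.
records = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (8.0, 9.0)]
print(top_outliers(records, k=2, n=1))  # prints [4]
```

The brute-force pairwise comparison is quadratic in the number of records, which is the kind of cost on large-scale data that the abstract says motivates pruning.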
Pages: 157-162
Number of pages: 6