NearCount: Selecting critical instances based on the cited counts of nearest neighbors

Cited by: 12
Authors
Zhu, Zonghai [1 ,2 ]
Wang, Zhe [1 ,2 ]
Li, Dongdong [2 ]
Du, Wenli [1 ]
Affiliations
[1] East China Univ Sci & Technol, Minist Educ, Key Lab Adv Control & Optimizat Chem Proc, Shanghai 200237, Peoples R China
[2] East China Univ Sci & Technol, Dept Comp Sci & Engn, Shanghai 200237, Peoples R China
Funding
National Science Foundation (United States);
Keywords
Critical instance; Nearest neighbor; Cited counts; Imbalanced problem; Instance selection; DATA REDUCTION; CLASSIFICATION; SMOTE;
DOI
10.1016/j.knosys.2019.105196
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Traditional instance selection algorithms are not good at addressing imbalanced problems. Moreover, most of them are sensitive to noise instances and suffer from complex selection rules. To solve these problems, in this paper, we propose a concise learning framework named NearCount to determine the importance of the instance without editing noise. In NearCount, the importance of an instance corresponds to its cited count. The count is the number of times that the instance is selected as a nearest neighbor of instances in different classes. For the instances with nonzero cited counts, the importance of the instance is inversely proportional to the cited count. To handle classification problems with different data distributions, two detailed NearCount-based algorithms - NearCount-IM and NearCount-IS - are introduced. For imbalanced problems, NearCount-IM selects the important majority instances with an equal number of minority instances, thus balancing the data distribution. For balanced scenarios, NearCount-IS selects the instances whose cited counts are greater than zero and equal to or less than the number of nearest neighbors as critical instances in every class. The proposed NearCount-IM and NearCount-IS algorithms are evaluated by comparing them with classical instance selection algorithms on the benchmark data sets. Experiments validate the effectiveness of the proposed algorithms. © 2019 Elsevier B.V. All rights reserved.
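The counting rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`cited_counts`, `select_is`, `select_im`), the use of Euclidean distance, the reading of "inversely proportional" as ranking nonzero counts in ascending order, and the treatment of every non-minority instance as "majority" are all assumptions made for illustration.

```python
import numpy as np

def cited_counts(X, y, k=3):
    """For each instance, count how often it is one of the k nearest
    other-class neighbors of some instance (assumed Euclidean distance)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)  # squared pairwise distances
    counts = np.zeros(len(X), dtype=int)
    for i in range(len(X)):
        other = np.flatnonzero(y != y[i])        # instances of a different class
        if other.size == 0:
            continue
        nearest = other[np.argsort(dist[i, other], kind="stable")[:k]]
        counts[nearest] += 1                     # each neighbor is "cited" once
    return counts

def select_is(X, y, k=3):
    """Balanced case: keep instances with 0 < cited count <= k."""
    counts = cited_counts(X, y, k)
    return np.flatnonzero((counts > 0) & (counts <= k))

def select_im(X, y, k=3):
    """Imbalanced case (assumed reading): keep all minority instances plus an
    equal number of majority instances, preferring small nonzero counts."""
    y = np.asarray(y)
    counts = cited_counts(X, y, k)
    labels, sizes = np.unique(y, return_counts=True)
    minority = labels[np.argmin(sizes)]
    min_idx = np.flatnonzero(y == minority)
    maj_idx = np.flatnonzero(y != minority)
    # Zero-count instances are ranked last; nonzero counts ascending.
    big = np.iinfo(counts.dtype).max
    key = np.where(counts[maj_idx] > 0, counts[maj_idx], big)
    order = np.argsort(key, kind="stable")
    chosen = maj_idx[order[:min_idx.size]]
    return np.sort(np.concatenate([min_idx, chosen]))

# Toy one-dimensional data: four majority points and two minority points.
X = [[0], [1], [2], [3], [10], [11]]
y = [0, 0, 0, 0, 1, 1]
print(cited_counts(X, y, k=2))
print(select_is(X, y, k=2))
print(select_im(X, y, k=2))
```

On this toy data the majority points closest to the class boundary receive nonzero counts and are the ones retained, which matches the abstract's idea that instances cited by other-class neighbors are the critical ones.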
Pages: 17
Related papers
50 in total
  • [1] A New Samples Selecting Method based on K Nearest Neighbors
    Yang, Kai
    Cai, Yi
    Cai, Zhiwei
    Tan, Xingwei
    Xie, Haoran
    Wong, Tak Lam
    Chan, Wai Hong
    2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2017, : 457 - 462
  • [2] Clustering method based on nearest neighbors representation
    State Key Laboratory for Novel Software Technology , Nanjing
    210023, China
    Ruan Jian Xue Bao, 11 (2847-2855):
  • [3] A hybrid classifier based on boxes and nearest neighbors
    Anthony, Martin
    Ratsaby, Joel
    DISCRETE APPLIED MATHEMATICS, 2014, 172 : 1 - 11
  • [4] Nearest neighbors and continous nearest neighbor queries based on voronoi diagrams
    Wang M.
    Hao Z.
    Information Technology Journal, 2010, 9 (07) : 1467 - 1475
  • [5] A clustering algorithm based absorbing nearest neighbors
    Hu, JJ
    Tang, CJ
    Peng, J
    Li, C
    Yuan, CA
    Chen, AL
    ADVANCES IN WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2005, 3739 : 700 - 705
  • [6] River Flow Prediction Using Dynamic Method for Selecting and Prioritizing K-Nearest Neighbors Based on Data Features
    Ebrahimi, Ehsan
    Shourian, Mojtaba
    JOURNAL OF HYDROLOGIC ENGINEERING, 2020, 25 (05)
  • [7] Binary Classification Based on SVDD Projection and Nearest Neighbors
    Kang, Daesung
    Park, Jooyoung
    Principe, Jose C.
    2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS IJCNN 2010, 2010,
  • [8] Center-based indexing for nearest neighbors search
    Wojna, A
    THIRD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2003, : 681 - 684
  • [9] Kernel-Based Transductive Learning with Nearest Neighbors
    Sim, Liangcai
    Wu, Jinhui
    Yu, Lei
    Meng, Weiyi
    ADVANCES IN DATA AND WEB MANAGEMENT, PROCEEDINGS, 2009, 5446 : 345 - 356
  • [10] Selecting reliable instances based on evidence theory for transfer learning
    Lv, Ying
    Zhang, Bofeng
    Yue, Xiaodong
    Denoeux, Thierry
    Yue, Shan
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 250