A Hybrid Coupled k-Nearest Neighbor Algorithm on Imbalance Data

Cited by: 0
Authors
Liu, Chunming [1]
Cao, Longbing [1]
Yu, Philip S. [2]
Affiliations
[1] Univ Technol Sydney, Adv Analyt Inst, Sydney, NSW 2007, Australia
[2] Univ Illinois, Comp Sci, Chicago, IL USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
State-of-the-art classification algorithms rarely consider the relationships between attributes in a data set and instead assume that the attributes are independent of each other (the IID assumption). In real-world data, however, attributes interact to some degree through explicit or implicit relationships. Although classifiers for class-balanced data are relatively well developed, classifying class-imbalanced data is not straightforward, especially for mixed-type data that has both categorical and numerical features. Limited research has been conducted on class-imbalanced data. Some algorithms synthesize or remove instances to make the class sizes comparable, which may change the inherent data structure or introduce noise into the source data. Distance- or similarity-based algorithms, in turn, ignore the relationships between features when computing similarity. This paper proposes a hybrid coupled k-nearest neighbor classification algorithm (HC-kNN) for mixed-type data. It discretizes the numerical features so that the inter-coupling similarity used for categorical features can be applied to them as well, then combines this coupled similarity with the original similarity or distance to overcome the shortcomings of previous algorithms. The experimental results demonstrate that the proposed algorithm achieves higher average performance than the relevant algorithms (e.g. variants of kNN, Decision Tree, SMOTE and NaiveBayes).
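The abstract does not state the coupled similarity formula, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes equal-width binning for discretization, a coupling measure based on the overlap of conditional co-occurrence distributions, and a linear blend weight `alpha`; every function name and parameter here is hypothetical.

```python
import numpy as np

def fit_bins(X_num, n_bins=4):
    """Interior equal-width bin edges for each numerical column (assumed scheme)."""
    return [np.linspace(c.min(), c.max(), n_bins + 1)[1:-1] for c in X_num.T]

def to_codes(X_num, X_cat, edges):
    """Discretize numerical columns and append the integer-coded categorical ones."""
    binned = [np.digitize(c, e) for c, e in zip(X_num.T, edges)]
    return np.column_stack(binned + [X_cat])

def coupling_tables(codes):
    """For each attribute j, build a lookup {(u, v): similarity} that measures how
    similarly values u and v of j co-occur with the values of the other attributes."""
    n, d = codes.shape
    tables = []
    for j in range(d):
        values = np.unique(codes[:, j])
        total = np.zeros((len(values), len(values)))
        for k in range(d):
            if k == j:
                continue
            vk = np.unique(codes[:, k])
            # cond[a, b] = P(attribute k == vk[b] | attribute j == values[a])
            cond = np.array([[np.mean(codes[codes[:, j] == u, k] == w) for w in vk]
                             for u in values])
            # Overlap of the two conditional distributions, in [0, 1]
            total += np.minimum(cond[:, None, :], cond[None, :, :]).sum(axis=2)
        total /= max(d - 1, 1)
        tables.append({(u, v): total[a, b]
                       for a, u in enumerate(values) for b, v in enumerate(values)})
    return tables

def predict(x_num, x_cat, X_num, X_cat, y, k=3, alpha=0.5, n_bins=4):
    """Classify one query by majority vote among the k most similar training rows,
    where similarity blends the coupled and the distance-based terms."""
    X_num = np.asarray(X_num, dtype=float)
    x_num = np.asarray(x_num, dtype=float)
    X_cat, x_cat, y = np.asarray(X_cat), np.asarray(x_cat), np.asarray(y)
    edges = fit_bins(X_num, n_bins)
    codes = to_codes(X_num, X_cat, edges)
    tables = coupling_tables(codes)
    q = to_codes(x_num[None, :], x_cat[None, :], edges)[0]
    # Coupled similarity: mean table lookup over attributes (0 for unseen values)
    coupled = np.array([np.mean([tables[j].get((q[j], row[j]), 0.0)
                                 for j in range(len(q))]) for row in codes])
    # Original similarity: range-scaled Euclidean distance mapped into (0, 1]
    span = X_num.max(axis=0) - X_num.min(axis=0)
    span[span == 0] = 1.0
    dist = np.linalg.norm((X_num - x_num) / span, axis=1)
    original = 1.0 / (1.0 + dist)
    hybrid = alpha * coupled + (1 - alpha) * original
    nearest = np.argsort(-hybrid)[:k]
    labels, counts = np.unique(y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy imbalanced data: six majority rows (class 0), two minority rows (class 1)
X_num = np.array([[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [5.0], [5.2]])
X_cat = np.array([[0], [0], [0], [0], [0], [0], [1], [1]])
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])
minority_guess = predict(np.array([4.9]), np.array([1]), X_num, X_cat, y)
majority_guess = predict(np.array([0.3]), np.array([0]), X_num, X_cat, y)
```

Blending the two terms with a single weight keeps the sketch easy to inspect: `alpha=0` reduces it to a plain distance-based kNN, while `alpha=1` relies only on the coupled term.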
Pages: 2011-2018
Number of pages: 8