Research on the Improvement of K-Nearest Neighbor Classifier for Imbalanced Text Categorization

Cited by: 0
Authors
Yang Yanmei [1]
Xu Linying [1]
Affiliations
[1] Tianjin Univ, Sch Comp Sci & Technol, Tianjin, Peoples R China
Keywords
Chinese text categorization; KNN; feature selection; SMOTE; Tomek-Links; sampling method
DOI
10.1109/IMCCC.2018.00204
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Widely used text classification methods such as the K-Nearest Neighbor (KNN), Naive Bayes (NB), and Support Vector Machine (SVM) algorithms perform well on balanced data but poorly on imbalanced data. Many researchers have proposed solutions to this problem; we propose a new method to improve the performance of the K-Nearest Neighbor classifier on imbalanced classification. In this paper, we combine the K-Nearest Neighbor classifier with a new feature selection method called NFS, an improved Synthetic Minority Over-sampling Technique (SMOTE), and the Tomek Links under-sampling technique. The experimental results demonstrate that the improved method significantly improves the classification efficiency of the K-Nearest Neighbor classifier on the imbalanced (biased) dataset.
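The resampling-plus-KNN idea in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: it uses plain SMOTE rather than the authors' improved variant, it omits the NFS feature selection step (the abstract does not describe it), it oversamples first and then cleans Tomek links, and it uses toy 2-D points in place of real text feature vectors. All function names (`smote`, `tomek_links`, `drop_majority_in_links`, `knn_predict`) are hypothetical.

```python
import math
import random
from collections import Counter


def smote(minority, n_new, k=2, seed=0):
    """Plain SMOTE (not the paper's improved variant): place each new point on
    the segment between a minority sample and one of its k nearest minority
    neighbours. Needs at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that are mutual nearest neighbours
    and carry different labels."""
    def nearest_of(i):
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))
    nearest = [nearest_of(i) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nearest)
            if i < j and nearest[j] == i and y[i] != y[j]]


def drop_majority_in_links(X, y, majority_label):
    """Under-sample by removing the majority-class member of each Tomek link."""
    drop = set()
    for i, j in tomek_links(X, y):
        drop.add(i if y[i] == majority_label else j)
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


def knn_predict(X, y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    order = sorted(range(len(X)), key=lambda i: math.dist(X[i], query))[:k]
    return Counter(y[i] for i in order).most_common(1)[0][0]


# Toy data (assumed for illustration): 8 majority points, 3 minority points.
# The last majority point sits right next to a minority point.
majority = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.6), (0.8, 0.9),
            (1.0, 0.3), (0.4, 1.1), (1.2, 1.0), (2.9, 3.0)]
minority = [(3.0, 3.0), (3.5, 3.2), (3.2, 3.8)]

# Step 1: oversample the minority class until the classes are balanced.
synthetic = smote(minority, n_new=len(majority) - len(minority))
X = majority + minority + synthetic
y = ["majority"] * len(majority) + ["minority"] * (len(minority) + len(synthetic))

# Step 2: clean the class boundary by dropping majority points in Tomek links.
X_clean, y_clean = drop_majority_in_links(X, y, "majority")

# Step 3: classify a query with KNN on the resampled data.
print(Counter(y_clean))
print(knn_predict(X_clean, y_clean, (3.3, 3.3)))
```

The sketch is pure Python so that each step stays visible. In practice, libraries such as imbalanced-learn (`SMOTE`, `TomekLinks`) and scikit-learn (`KNeighborsClassifier`) provide tested implementations of the standard versions of these techniques.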
Pages: 968-972
Page count: 5