Noise reduction to text categorization based on density for KNN

Cited by: 7
Authors
Li, RL [1 ]
Hu, YF [1 ]
Affiliation
[1] Fudan Univ, Comp Technol & Informat Dept, Shanghai 200433, Peoples R China
Keywords
text classification; k-Nearest Neighbor;
DOI
10.1109/ICMLC.2003.1260115
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the rapid development of the World Wide Web, text classification has become a key technology for organizing and processing large amounts of document data. As a simple and effective classification approach, the KNN method is widely used in text categorization. However, a KNN classifier not only has large computational demands, but its classification precision may also decrease when the density of the training data is uneven. In this paper, we present a density-based method for reducing noise in the training data, which addresses both problems. Our experimental results illustrate this.
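The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, under stated assumptions: a training document is treated as noise when too few same-class documents lie in its neighborhood, and KNN then votes over the reduced training set. The neighborhood test (cosine similarity at or above `eps`, at least `min_pts` same-class neighbors) is a DBSCAN-style stand-in, not the authors' method, and the names `filter_noise`, `knn_predict`, `eps`, and `min_pts` as well as the toy vectors are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_noise(samples, eps=0.8, min_pts=1):
    """Drop samples with fewer than min_pts same-class neighbors.

    Assumption: a neighbor is another sample whose cosine similarity
    to this one is at least eps. This is illustrative only.
    """
    kept = []
    for i, (vec, label) in enumerate(samples):
        same_class_neighbors = sum(
            1
            for j, (other, other_label) in enumerate(samples)
            if j != i and other_label == label and cosine(vec, other) >= eps
        )
        if same_class_neighbors >= min_pts:
            kept.append((vec, label))
    return kept


def knn_predict(samples, query, k=3):
    """Majority vote among the k training samples most similar to query."""
    ranked = sorted(samples, key=lambda s: cosine(query, s[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training set (hypothetical data); the last sample is mislabeled.
train = [
    ([1.0, 0.1, 0.0], "sports"),
    ([0.9, 0.2, 0.0], "sports"),
    ([0.8, 0.0, 0.1], "sports"),
    ([0.0, 0.1, 1.0], "finance"),
    ([0.1, 0.0, 0.9], "finance"),
    ([0.0, 0.2, 0.8], "finance"),
    ([0.1, 0.1, 1.0], "sports"),
]

clean = filter_noise(train, eps=0.8, min_pts=1)
prediction = knn_predict(clean, [0.05, 0.1, 0.95], k=3)
```

In this toy set the mislabeled sample has no similar same-class neighbor, so the filter removes it before classification. A smaller training set also means fewer similarity computations per query, which is the cost side of the problem the abstract mentions.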
Pages: 3119 - 3124
Number of pages: 6
Related papers
50 in total
  • [41] Text Categorization Based on Topic Model
    School of Computer Science and Technology, China University of Mining and Technology, Jiangsu Province, Xuzhou
    221116, China
    Unknown
    100081, China
    [J]. Int. J. Comput. Intell. Syst., 2009, 4 (398-409): : 398 - 409
  • [42] A Learning Based Handwritten Text Categorization
    Sarker, Goutam
    Dhua, Silpi
    Besra, Monica
    [J]. 2015 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTER ENGINEERING AND APPLICATIONS (ICACEA), 2015, : 465 - 471
  • [43] Text categorization based on subtopic clusters
    Chik, FCY
    Luk, RWP
    Chung, KFL
    [J]. NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PROCEEDINGS, 2005, 3513 : 203 - 214
  • [44] Text categorization based on domain ontology
    He, QM
    Qiu, L
    Zhao, GT
    Wang, SK
    [J]. WEB INFORMATION SYSTEMS - WISE 2004, PROCEEDINGS, 2004, 3306 : 319 - 324
  • [45] Text Categorization Based on Topic Model
    Zhou, Shibin
    Li, Kan
    Liu, Yushu
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2009, 2 (04) : 398 - 409
  • [46] Research of Text Categorization Based on Ontology
    Wang Jiayun
    Zhang Rui
    Wang Peng
    [J]. PROCEEDINGS OF 2009 CONFERENCE ON COMMUNICATION FACULTY, 2009, : 167 - 170
  • [47] Research of Text Categorization Based on SVM
    Wang, Meihua
    Zhang, Hongbin
    Ding, Renshuang
    [J]. PROCEEDINGS OF THE 2011 INTERNATIONAL CONFERENCE ON INFORMATICS, CYBERNETICS, AND COMPUTER ENGINEERING (ICCE2011), VOL 2: INFORMATION SYSTEMS AND COMPUTER ENGINEERING, 2011, 111 : 69 - 77
  • [48] Text categorization based on topic model
    Zhou, Shibin
    Li, Kan
    Liu, Yushu
    [J]. ROUGH SETS AND KNOWLEDGE TECHNOLOGY, 2008, 5009 : 572 - 579
  • [49] Macro Features Based Text Categorization
    Wang, Dandan
    Chen, Qingcai
    Wang, Xiaolong
    Tang, Buzhou
    [J]. NEURAL INFORMATION PROCESSING, PT II, 2011, 7063 : 211 - 219
  • [50] A rough set-based CBR approach for feature and document reduction in text categorization
    Li, Y
    Shiu, SCK
    Pal, SK
    Liu, JNK
    [J]. PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004, : 2438 - 2443