The effective classification of the Chinese web pages based on KNN

Cited by: 0
Authors
Chen, Wenfei [1]
Du, Yajun [1]
Zhang, Peiying [1]
Han, Baochuan [1]
Affiliation
[1] College of Mathematics and Computer Engineering, Xihua University, Chengdu, Sichuan, 610039, China
Keywords
Feature selection; Classification (of information)
DOI
Not available
Abstract
To improve the efficiency and accuracy of Chinese web page classification and to help users quickly locate pages of interest, this paper presents an efficient feature selection method. We assign weights to different HTML tags, compute the final weight of each word that occurs in the document, and then select representative feature words to describe the document. Combined with the KNN classification algorithm, the method classifies Chinese web pages effectively. Experimental results demonstrate that the method reduces the dimensionality of the feature space and markedly improves precision and recall. © 2010 Binary Information Press.
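The pipeline described in the abstract (tag-weighted word scoring, feature selection, then KNN) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tag weights in `TAG_WEIGHTS`, the top-N selection rule, the cosine similarity measure, and all function names are assumptions made for illustration, since the abstract does not state them. The input is assumed to be already word-segmented Chinese text grouped by HTML tag.

```python
import math
from collections import Counter, defaultdict

# Hypothetical tag weights; the abstract does not give the actual values.
TAG_WEIGHTS = {"title": 5.0, "h1": 3.0, "b": 2.0, "body": 1.0}


def weighted_terms(doc):
    """Sum tag weights per word. doc is a list of (tag, [words]) pairs."""
    weights = defaultdict(float)
    for tag, words in doc:
        w = TAG_WEIGHTS.get(tag, 1.0)  # unknown tags count as plain text
        for word in words:
            weights[word] += w
    return dict(weights)


def select_features(vec, top_n):
    """Keep the top_n highest-weighted words (ties broken by word order)."""
    top = sorted(vec.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
    return dict(top)


def cosine(a, b):
    """Cosine similarity between two sparse word-weight vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_classify(query, training, k=3):
    """Majority vote among the k training vectors most similar to query."""
    ranked = sorted(training, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up training pages (not data from the paper).
train_docs = [
    ([("title", ["足球", "比赛"]), ("body", ["球队", "进球", "比赛"])], "sports"),
    ([("title", ["篮球", "联赛"]), ("body", ["球队", "得分"])], "sports"),
    ([("title", ["股票", "市场"]), ("body", ["投资", "股票", "上涨"])], "finance"),
    ([("title", ["银行", "利率"]), ("body", ["投资", "贷款"])], "finance"),
]
training = [(select_features(weighted_terms(d), 3), label) for d, label in train_docs]

query_doc = [("title", ["股票"]), ("body", ["投资", "上涨"])]
query = select_features(weighted_terms(query_doc), 3)
predicted = knn_classify(query, training, k=3)
print(predicted)
```

Keeping only the top-weighted words per page is one simple way to obtain the reduced feature space the abstract mentions; the paper's actual selection criterion may differ.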
Pages: 2925-2932