Study on Web-page classification algorithm based on rough set theory

Times cited: 0
Authors
Yin, Shiqun [1]
Wang, Fang [1]
Xie, Zhong [1]
Qiu, Yuhui [1]
Affiliation
[1] Southwest Univ, Fac Comp & Informat Sci, Chongqing 400715, Peoples R China
Keywords
rough set; classification rule; feature selection; Web-page; vector space model
DOI
10.1109/ISIP.2008.118
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the development of Internet technology, the large number of Web-page documents now constitutes a huge, high-dimensional text database, yet only a very small portion of it is relevant to any given user. Web-page classification technology assigns Web pages to a category structure, which not only makes it convenient for users to browse Web pages but also makes searching easier by restricting the search scope. Mining high-dimensional data is extraordinarily difficult because of the curse of dimensionality, so feature selection must be adopted to address this problem. This paper presents an algorithm that uses attribute reduction from rough set theory to reduce the Web-page feature terms and then extract classification rules. Experimental results show that this method greatly reduces the dimension of the feature vector space and yields easy-to-understand classification rules, and that it achieves higher accuracy and faster classification than classification based on vector comparison.
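The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of rough-set attribute reduction followed by rule extraction. It assumes a tiny binary term-presence table and uses a greedy QuickReduct-style search, which is a stand-in technique and not necessarily the authors' method. The names `ROWS`, `LABELS`, `TERMS`, `quick_reduct`, and `extract_rules` are hypothetical and the data is invented for illustration.

```python
from collections import defaultdict

# Toy decision table (assumed data): each row marks which terms occur in a page.
TERMS = ["goal", "match", "stock", "market"]
ROWS = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
LABELS = ["sports", "sports", "finance", "finance", "sports", "finance"]


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Share of rows in the positive region: classes that carry one label only."""
    consistent = sum(
        len(block)
        for block in partition(rows, attrs)
        if len({labels[i] for i in block}) == 1
    )
    return consistent / len(rows)


def quick_reduct(rows, labels):
    """Greedily add the attribute that raises the dependency degree the most."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    while dependency(rows, labels, reduct) < target:
        best = max(
            (a for a in all_attrs if a not in reduct),
            key=lambda a: dependency(rows, labels, reduct + [a]),
        )
        reduct.append(best)
    return reduct


def extract_rules(rows, labels, reduct):
    """Turn each label-consistent equivalence class into a value -> label rule."""
    rules = {}
    for block in partition(rows, reduct):
        block_labels = {labels[i] for i in block}
        if len(block_labels) == 1:
            key = tuple(rows[block[0]][a] for a in reduct)
            rules[key] = block_labels.pop()
    return rules


REDUCT = quick_reduct(ROWS, LABELS)
RULES = extract_rules(ROWS, LABELS, REDUCT)
```

On this toy table the single term "goal" already separates the two categories, so the four-term feature space shrinks to one attribute and the extracted rules read as "goal present, then sports" and "goal absent, then finance". A greedy search of this kind returns a small attribute subset that preserves the dependency degree, but it is not guaranteed to find a minimal reduct.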
Pages: 202-206
Page count: 5
Related papers
50 in total
  • [1] Rough set-aided feature selection for automatic Web-page classification
    Wakaki, T
    Itakura, H
    Tamura, M
    [J]. IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE (WI 2004), PROCEEDINGS, 2004, : 70 - 76
  • [2] Chinese web-page classification study
    Huang, Weitong
    Lu-Xiong Xu
    Duan, Junfeng
    Lu, Yuchang
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON CONTROL AND AUTOMATION, VOLS 1-7, 2007, : 2141 - +
  • [3] Rough set based ensemble classifier for web page classification
    Saha, Suman
    Murthy, C. A.
    Pal, Sankar K.
    [J]. FUNDAMENTA INFORMATICAE, 2007, 76 (1-2) : 171 - 187
  • [4] An Algorithm of Semi-supervised Web-page Classification Based on Fuzzy Clustering
    Chen Geng
    Zhu Yuquan
    Tan Jianing
    Hu Tianhan
    [J]. 2009 INTERNATIONAL FORUM ON INFORMATION TECHNOLOGY AND APPLICATIONS, VOL 1, PROCEEDINGS, 2009, : 3 - +
  • [5] Automatic web services classification based on rough set theory
    Chen Li
    Zhang Ying
    Song Zi-lin
    Miao Zhuang
    [J]. JOURNAL OF CENTRAL SOUTH UNIVERSITY, 2013, 20 (10) : 2708 - 2714
  • [6] Automatic web services classification based on rough set theory
    陈立
    张英
    宋自林
    苗壮
    [J]. Journal of Central South University, 2013, 20 (10) : 2708 - 2714
  • [7] Automatic web services classification based on rough set theory
    Li Chen
    Ying Zhang
    Zi-lin Song
    Zhuang Miao
    [J]. Journal of Central South University, 2013, 20 : 2708 - 2714
  • [8] Noise reduction through summarization for web-page classification
    Shen, Dou
    Yang, Qiang
    Chen, Zheng
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2007, 43 (06) : 1735 - 1747
  • [9] Classification of digital mammography algorithm based on rough set theory
    Hassanien, AE
    Ali, JMH
    [J]. AUTOMATIC CONTROL AND COMPUTER SCIENCES, 2003, 37 (06) : 64 - 71
  • [10] Document representations for classification of short Web-page descriptions
    Radovanovic, Milos
    Ivanovic, Mirjana
    [J]. DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, 2006, 4081 : 544 - 553