Automatic Web Page Classification Using Various Features

Cited by: 0
Authors
Wen, Hao [1]
Fang, Liping [1]
Guan, Ling [2]
Affiliations
[1] Ryerson Univ, Dept Mech & Ind Engn, Toronto, ON, Canada
[2] Ryerson Univ, Dept Elect & Comp Engn, Toronto, ON, Canada
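Keywords
Automatic classification; data fusion; ontology
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract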
A model for automatically classifying uncertain Web pages using multiple features is presented. Because a traditional tree structure can barely keep up with an avalanche of new Web pages, the proposed approach partly adopts the "bag of words" idea and combines it with classification fusion to describe and categorize Web pages. The approach extracts features of Web pages from several perspectives: consulting a Web directory service, analyzing the text features of page titles and meta-search keywords, and identifying the primary content of the pages. By fusing the results of these three dedicated classifiers, each Web page is assigned to one or more categories, together with a set of words representing the page. Experiments are carried out to demonstrate the effectiveness of the proposed method; in them, Web pages are classified into four categories using the proposed fusion method. A comparison between the dedicated classifiers and the fusion methods is also presented.
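The abstract does not state how the three classifier outputs are fused. As a minimal sketch of one common option, the Python below assumes each dedicated classifier returns a score per category and combines them by weighted averaging, then keeps every category whose fused score passes a threshold (so a page can receive one or more categories). The function names, weights, threshold, category labels, and scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List

Scores = Dict[str, float]  # category name -> score in [0, 1]


def fuse_scores(classifier_scores: List[Scores], weights: List[float]) -> Scores:
    """Weighted average of per-category scores (an assumed fusion rule)."""
    if not classifier_scores or len(classifier_scores) != len(weights):
        raise ValueError("need one weight per classifier and at least one classifier")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Collect every category mentioned by any classifier.
    categories = sorted(set().union(*classifier_scores))
    return {
        category: sum(
            weight * scores.get(category, 0.0)
            for scores, weight in zip(classifier_scores, weights)
        ) / total_weight
        for category in categories
    }


def assign_categories(fused: Scores, threshold: float = 0.5) -> List[str]:
    """Keep every category at or above the threshold; fall back to the best one."""
    chosen = [category for category, score in fused.items() if score >= threshold]
    if not chosen and fused:
        chosen = [max(fused, key=fused.get)]
    return chosen


# Hypothetical outputs of the three dedicated classifiers for one page.
directory_scores = {"sports": 0.9, "news": 0.1}   # Web directory lookup
title_meta_scores = {"sports": 0.6, "news": 0.4}  # title and meta keywords
content_scores = {"sports": 0.3, "news": 0.7}     # primary page content

fused = fuse_scores(
    [directory_scores, title_meta_scores, content_scores],
    weights=[1.0, 1.0, 1.0],
)
labels = assign_categories(fused, threshold=0.5)
```

Equal weights treat the three views as equally reliable; unequal weights would let a more trusted source, such as the directory lookup, dominate. Other fusion rules (majority voting, product of scores) fit the same interface.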
Pages: 368 / +
Page count: 3
Related papers
50 in total
  • [1] Automatic Web Page Classification
    Materna, Jiri
    RASLAN 2008: RECENT ADVANCES IN SLAVONIC NATURAL LANGUAGE PROCESSING: SECOND WORKSHOP, 2008, : 84 - 93
  • [2] Web Page Classification Using Image Analysis Features
    de Boer, Viktor
    van Someren, Maarten W.
    Lupascu, Tiberiu
    WEB INFORMATION SYSTEMS AND TECHNOLOGIES, 2011, 75 : 272 - +
  • [3] Web Page Classification: Features and Algorithms
    Qi, Xiaoguang
    Davison, Brian D.
    ACM COMPUTING SURVEYS, 2009, 41 (02)
  • [4] A Novel Approach for Web Page Classification using Optimum features
    Mangai, J. Alamelu
    Kumar, V. Santhosh
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2011, 11 (05): : 252 - 257
  • [5] Automatic classification of academic web page types
    Patrick Kenekayoro
    Kevan Buckley
    Mike Thelwall
    Scientometrics, 2014, 101 : 1015 - 1026
  • [6] A Chinese Web Page Automatic Classification System
    Huang, Rongyou
    Zhao, Xinjian
    WEB INFORMATION SYSTEMS AND MINING, 2010, 6318 : 61 - +
  • [7] Automatic classification of academic web page types
    Kenekayoro, Patrick
    Buckley, Kevan
    Thelwall, Mike
    SCIENTOMETRICS, 2014, 101 (02) : 1015 - 1026
  • [8] Automatic Web Page Classification Using Visual Content for Subjective and Functional Variables
    Goncalves, Nuno
    Videira, Antonio
    WEB INFORMATION SYSTEMS AND TECHNOLOGIES, WEBIST 2014, 2015, 226 : 279 - 294
  • [9] Grid-enabled automatic web page classification
    Metikurke, Seema
    Vaishnavi, Vijay K.
    Vandenberg, Art
    Li, Lei
    2006 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-5, 2006, : 377 - +
  • [10] Automatic web page classification in a dynamic and hierarchical way
    Peng, XG
    Choi, B
    2002 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2002, : 386 - 393