Automatic Web Page Classification Using Various Features

Cited by: 0
Authors
Wen, Hao [1]
Fang, Liping [1]
Guan, Ling [2]
Affiliations
[1] Ryerson Univ, Dept Mech & Ind Engn, Toronto, ON, Canada
[2] Ryerson Univ, Dept Elect & Comp Engn, Toronto, ON, Canada
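Keywords
Automatic classification; data fusion; ontology
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract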
A model for automatically classifying uncertain Web pages using multiple features is presented. Because the traditional tree structure can barely cope with an avalanche of new Web pages, the proposed approach partially adopts the "bag of words" idea and combines it with classification fusion to describe and categorize Web pages. The approach extracts features of Web pages from several perspectives: consulting a Web directory service, analyzing the text features of Web pages' titles and meta-search keywords, and identifying the primary content of Web pages. By fusing the results from these three dedicated classifiers, Web pages are classified into one or more categories, with a set of words representing each page. To demonstrate the effectiveness of the proposed method, experiments are carried out in which Web pages are classified into four categories using the proposed fusion method. A comparison between the dedicated classifiers and fusion methods is also presented.
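The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule the authors use, so the weighted average and the threshold below are assumptions chosen only for illustration. The category names, the score values, and the function names fuse_scores and assign_categories are likewise hypothetical.

```python
from typing import Dict, List

# Hypothetical category labels; the abstract only says there are four.
CATEGORIES = ["news", "sports", "shopping", "education"]

Scores = Dict[str, float]


def fuse_scores(classifier_scores: List[Scores], weights: List[float]) -> Scores:
    """Weighted average of per-category scores (an assumed fusion rule)."""
    if len(classifier_scores) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = {}
    for category in CATEGORIES:
        weighted = sum(
            w * s.get(category, 0.0) for s, w in zip(classifier_scores, weights)
        )
        fused[category] = weighted / total_weight
    return fused


def assign_categories(fused: Scores, threshold: float = 0.5) -> List[str]:
    """Keep every category at or above the threshold, else the single best one."""
    selected = [c for c in CATEGORIES if fused[c] >= threshold]
    if not selected:
        selected = [max(CATEGORIES, key=lambda c: fused[c])]
    return selected


# Made-up scores standing in for the three dedicated classifiers.
directory_scores = {"news": 0.9, "sports": 0.1, "shopping": 0.0, "education": 0.0}
title_keyword_scores = {"news": 0.6, "sports": 0.7, "shopping": 0.1, "education": 0.0}
content_scores = {"news": 0.8, "sports": 0.6, "shopping": 0.0, "education": 0.1}

fused = fuse_scores(
    [directory_scores, title_keyword_scores, content_scores],
    [1.0, 1.0, 1.0],
)
labels = assign_categories(fused, threshold=0.45)
print(labels)  # ['news', 'sports']
```

With equal weights, the page receives two labels because both "news" and "sports" clear the assumed threshold, which mirrors the abstract's point that a page may belong to one or more categories.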
Pages: 368 / +
Page count: 3
Related papers
50 in total
  • [41] Web page classification using modified naive Bayesian approach
    Tomar, G. S.
    Verma, Shekhar
    Jha, Ashish
    TENCON 2006 - 2006 IEEE REGION 10 CONFERENCE, VOLS 1-4, 2006, : 1016 - +
  • [42] Web page feature selection and classification using neural networks
    Selamat, A
    Omatu, S
    INFORMATION SCIENCES, 2004, 158 : 69 - 88
  • [43] Block classification of a web page by using a combination of multiple classifiers
    Kang, Jinbeom
    Choi, Joongmin
    NCM 2008: 4TH INTERNATIONAL CONFERENCE ON NETWORKED COMPUTING AND ADVANCED INFORMATION MANAGEMENT, VOL 2, PROCEEDINGS, 2008, : 290 - 295
  • [44] Rough set-aided feature selection for automatic Web-page classification
    Wakaki, T
    Itakura, H
    Tamura, M
    IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE (WI 2004), PROCEEDINGS, 2004, : 70 - 76
  • [45] Web Page Segmentation with Structured Prediction and its Application in Web Page Classification
    Bing, Lidong
    Guo, Rui
    Lam, Wai
    Niu, Zheng-Yu
    Wang, Haifeng
    SIGIR'14: PROCEEDINGS OF THE 37TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2014, : 767 - 776
  • [46] Enhancing Web Page Classification Models
    Elsalmy, Fayrouz
    Ismail, Rasha
    AbdelMoez, Walid
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT SYSTEMS AND INFORMATICS 2016, 2017, 533 : 742 - 750
  • [47] Mixture Models for Web Page Classification
    Bai JingHua
    Zhang XiaoXian
    Li ZhiXin
    Li XiaoPing
    INTERNATIONAL CONFERENCE ON SOLID STATE DEVICES AND MATERIALS SCIENCE, 2012, 25 : 499 - 505
  • [48] Ensemble approach for web page classification
    Gupta, Amit
    Bhatia, Rajesh
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (16) : 25219 - 25240
  • [49] Heterogeneous learner for web page classification
    Yu, HJ
    Chang, KCC
    Han, JW
    2002 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2002, : 538 - 545
  • [50] Towards Effective Web Page Classification
    Gu, Min
    Zhu, Feng
    Guo, Qing
    Gu, Yanhui
    Zhou, Junsheng
    Qu, Weiguang
    2016 INTERNATIONAL CONFERENCE ON BEHAVIORAL, ECONOMIC AND SOCIO-CULTURAL COMPUTING (BESC), 2016, : 126 - 127