A voting method for the classification of web pages

Cited by: 1
Authors
Fang, Rui [1]
Mikroyannidis, Alexander [1]
Theodoulidis, Babis [1]
Affiliation
[1] Univ Manchester, Sch Informat, Sackville St, Manchester M60 1QD, Lancs, England
DOI
10.1109/WI-IATW.2006.23
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper discusses web page classification using hypertext features such as the text included in the web page, the title, headings, URL, and anchor text. Five different classification approaches based on SVM that use individual features or combinations are investigated on the LookSmart dataset. The initial experimental results have shown that combining the features improves the performance of the classifier and that some features such as title and headings can be very useful for certain tasks. On the basis of this analysis, we propose a voting method that further improves the performance compared with the individual classifiers.
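The abstract does not say which voting scheme is used, so the following is only a minimal sketch of one plausible reading: each feature-specific classifier (text, title, headings, URL, anchor text) emits a label, and the label with the most votes wins. The function name `majority_vote`, the `FEATURES` priority order, and the tie-breaking rule are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter
from typing import Dict, List

# Feature names follow the abstract; their order here is an assumed
# tie-breaking priority, not something the paper specifies.
FEATURES: List[str] = ["text", "title", "headings", "url", "anchor"]


def majority_vote(predictions: Dict[str, str],
                  priority: List[str] = FEATURES) -> str:
    """Combine per-feature classifier outputs by plurality vote.

    `predictions` maps a feature name to the label predicted by the
    classifier trained on that feature. Ties are broken in favour of the
    tied label predicted by the earliest feature in `priority`
    (an illustrative assumption).
    """
    if not predictions:
        raise ValueError("no predictions supplied")

    counts = Counter(predictions.values())
    top = max(counts.values())
    tied = {label for label, n in counts.items() if n == top}

    # Walk the priority list and return the first tied label we meet.
    for feature in priority:
        label = predictions.get(feature)
        if label in tied:
            return label

    # None of the voters appear in `priority`; fall back to a
    # deterministic choice among the tied labels.
    return sorted(tied)[0]
```

For example, if the text, title, and anchor classifiers predict "Sports" while the headings and URL classifiers predict "Health", `majority_vote` returns "Sports". A weighted vote, with weights taken from each classifier's validation accuracy, would be an equally reasonable variant.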
Pages: 610 / +
Page count: 2
Related papers
50 in total
  • [1] Voting model for ranking Web pages
    Lifantsev, M
    [J]. IC'2000: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTERNET COMPUTING, 2000, : 143 - 148
  • [2] Fuzzy Clustering Method for Web User Based on Pages Classification
    ZHAN Li-qiang
    [J]. Wuhan University Journal of Natural Sciences, 2004, (05) : 553 - 556
  • [3] Dynamic and hierarchical classification of Web pages
    Choi, B
    Peng, XG
    [J]. ONLINE INFORMATION REVIEW, 2004, 28 (02) : 139 - 147
  • [4] A Novel Framework for Web Pages Classification
    Hu, Ruiguang
    Hu, Weiming
    [J]. PROCEEDINGS OF 3RD INTERNATIONAL CONFERENCE ON MULTIMEDIA TECHNOLOGY (ICMT-13), 2013, 84 : 1061 - 1068
  • [5] Automatic Classification of Uighur Web Pages
    Xu Guixian
    Gao Xu
    Zhao Xiaobing
    Yang Guosheng
    [J]. 2013 THIRD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM DESIGN AND ENGINEERING APPLICATIONS (ISDEA), 2013, : 390 - 393
  • [6] Research on Uighur Web Pages Classification
    Xu, Guixian
    [J]. APPLIED SCIENCE, MATERIALS SCIENCE AND INFORMATION TECHNOLOGIES IN INDUSTRY, 2014, 513-517 : 657 - 662
  • [7] A Comparative Study of Web Pages Classification Methods Applied to Health Consumer Web Pages
    Siddiqui, Aneeta
    Adnan, Mehnaz
    Siddiqui, Rizwan Alam
    Mubeen, Tauseef
    [J]. 2015 SECOND INTERNATIONAL CONFERENCE ON COMPUTING TECHNOLOGY AND INFORMATION MANAGEMENT (ICCTIM), 2015, : 43 - 48
  • [8] A Method for Topic Classification of Web Pages Using LDA-SVM Model
    Wei, Yuliang
    Wang, Wei
    Wang, Bailing
    Yang, Bo
    Liu, Yang
    [J]. PROCEEDINGS OF 2017 CHINESE INTELLIGENT AUTOMATION CONFERENCE, 2018, 458 : 589 - 596
  • [9] Study on the pretreatment of web pages based on web text classification
    Li, Runzhi
    Zhang, Yangsen
    [J]. 11TH CHINESE LEXICAL SEMANTICS WORKSHOP (CKSW2010), 2010, : 356 - 360
  • [10] Web Pages Classification with Parliamentary Optimization Algorithm
    Kiziloluk, Soner
    Ozer, Ahmet Bedri
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2017, 27 (03) : 499 - 513