Application of rough ensemble classifier to web services categorization and focused crawling

Cited by: 4
Authors
Saha S. [1]
Murthy C.A. [1]
Pal S.K. [1]
Affiliation
[1] Center for Soft Computing Research, Indian Statistical Institute
Source
Web Intelligence and Agent Systems | 2010, Vol. 8, No. 2
Keywords
Focused crawling; Rough ensemble classifier; URL prediction; Web service categorization; WSDL tag structure
DOI
10.3233/WIA-2010-0186
Abstract
This paper discusses applications of the rough ensemble classifier [27] to two emerging web mining problems: the categorization of web services and topic-specific web crawling. Both applications consist of two major steps: (1) splitting the feature space according to the internal tag structure of web services and hypertext, so that each document is represented in a tensor space model, and (2) combining the classifications obtained on the different tensor components using the rough ensemble classifier. The first application is the classification of web services, where a two-step improvement over existing classification results is shown: the tensor space model alone gives better results than existing methods, and the rough set based ensemble classifier improves them further. The second application is focused crawling using rough ensemble prediction; in the reported experiments it gives a better harvest rate and better target recall. © 2010 - IOS Press and the authors. All rights reserved.
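The two steps named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the tag names in `TAGS`, the toy word-overlap base classifier, and the sample services are all assumptions made for illustration, and the rough set rule extraction of [27] is replaced here by a much simpler stand-in, a lookup table over base-classifier vote patterns with a majority-vote fallback.

```python
from collections import Counter, defaultdict

# Hypothetical tag-based components; the real WSDL tag split is not given here.
TAGS = ("name", "documentation", "operation")

def tokens(text):
    return text.lower().split()

class OverlapClassifier:
    """Toy base classifier for one component: scores each class by word overlap."""
    def __init__(self):
        self.words = defaultdict(Counter)

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.words[label].update(tokens(text))
        return self

    def predict(self, text):
        scores = {c: sum(w[t] for t in tokens(text)) for c, w in self.words.items()}
        # Sorting the class names first makes tie-breaking deterministic.
        return max(sorted(scores), key=scores.get)

class DecisionTableCombiner:
    """Simplified stand-in for the rough ensemble step: maps a tuple of
    base-classifier votes to the class most often seen with that tuple."""
    def __init__(self):
        self.table = defaultdict(Counter)

    def fit(self, vote_rows, labels):
        for votes, label in zip(vote_rows, labels):
            self.table[tuple(votes)][label] += 1
        return self

    def predict(self, votes):
        seen = self.table.get(tuple(votes))
        if seen:
            return seen.most_common(1)[0][0]
        # Unseen vote pattern: fall back to a plain majority vote.
        return Counter(votes).most_common(1)[0][0]

# Invented training services, each already split into one text per tag (step 1).
train = [
    ({"name": "weather forecast",
      "documentation": "returns the weather forecast for a city",
      "operation": "get forecast"}, "weather"),
    ({"name": "city temperature",
      "documentation": "current temperature and weather report",
      "operation": "get temperature"}, "weather"),
    ({"name": "currency converter",
      "documentation": "converts an amount between currencies",
      "operation": "convert amount"}, "finance"),
    ({"name": "stock quote",
      "documentation": "returns the latest stock price",
      "operation": "quote price"}, "finance"),
]
labels = [label for _, label in train]

# One base classifier per component.
base = {tag: OverlapClassifier().fit([doc[tag] for doc, _ in train], labels)
        for tag in TAGS}

def votes_for(doc):
    return tuple(base[tag].predict(doc[tag]) for tag in TAGS)

# Step 2: learn how to combine the per-component votes.
# Training votes are reused for brevity; a held-out set would be more sound.
combiner = DecisionTableCombiner().fit([votes_for(doc) for doc, _ in train], labels)

def classify(doc):
    return combiner.predict(votes_for(doc))

new_service = {"name": "forecast service",
               "documentation": "returns the latest stock price",
               "operation": "convert quote"}
prediction = classify(new_service)
print(votes_for(new_service), "->", prediction)
```

In this sketch the components disagree on `new_service`, and because that vote pattern never occurred in training, the combiner falls back to majority voting. A rough set based combiner would instead derive decision rules from the table of votes, which is the part this example deliberately leaves out.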
Pages: 181-202
Number of pages: 21