Application of rough ensemble classifier to web services categorization and focused crawling

Cited by: 4
Authors
Saha S. [1]
Murthy C.A. [1]
Pal S.K. [1]
Affiliation
[1] Center for Soft Computing Research, Indian Statistical Institute
Source
Web Intelligence and Agent Systems | 2010 / Vol. 8 / Issue 02
Keywords
Focused crawling; Rough ensemble classifier; URL prediction; Web service categorization; WSDL tag structure
DOI
10.3233/WIA-2010-0186
Abstract
This paper discusses applications of the rough ensemble classifier [27] to two emerging problems in web mining: the categorization of web services and topic-specific web crawling. Both applications consist of two major steps: (1) splitting the feature space according to the internal tag structure of web services and hypertext, so that documents can be represented in a tensor space model, and (2) combining the classifications obtained on the different tensor components using the rough ensemble classifier. The first application is the classification of web services, where a two-step improvement over existing classification results is shown. In the first step, the tensor space model yields better classification results than existing methods. In the second step, the rough set based ensemble classifier improves the results further. The second application is focused crawling using rough ensemble prediction. The experiments for this application yielded a better harvest rate and better target recall for focused crawling. © 2010 - IOS Press and the authors. All rights reserved.
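The abstract names harvest rate and target recall without defining them. As a minimal sketch, assuming the definitions commonly used in the focused crawling literature (harvest rate as the fraction of crawled pages that are relevant, target recall as the fraction of a held-out set of known relevant pages that the crawler reached), the two measures could be computed as follows. The function names and the toy URLs are hypothetical and do not come from the paper, which may define the measures differently in detail.

```python
def harvest_rate(crawled, is_relevant):
    """Fraction of crawled pages judged relevant (assumed definition)."""
    if not crawled:
        return 0.0
    relevant_count = sum(1 for url in crawled if is_relevant(url))
    return relevant_count / len(crawled)


def target_recall(crawled, targets):
    """Fraction of known target pages that the crawl reached (assumed definition)."""
    target_set = set(targets)
    if not target_set:
        return 0.0
    return len(set(crawled) & target_set) / len(target_set)


# Toy data, purely illustrative.
crawled = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}   # pages a relevance judge would accept
targets = {"a", "e"}         # held-out relevant pages, unknown to the crawler

print(harvest_rate(crawled, lambda url: url in relevant))
print(target_recall(crawled, targets))
```

In this toy crawl, two of the four fetched pages are relevant and one of the two target pages was reached, so both measures come out to 0.5.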
Pages: 181-202
Page count: 21
Related papers
50 entries in total
  • [41] A novel incremental parallel web crawler based on focused crawling
    Huang, Qiuyan
    Li, Qingzhong
    Yan, Zhongmin
    Fu, Hong
    Journal of Computational Information Systems, 2013, 9 (06): : 2461 - 2469
  • [42] Focused Deep Web Entrance Crawling by Form Feature Classification
    Wang, Lin
    Hawbani, Ammar
    Wang, Xingfu
    BIG DATA COMPUTING AND COMMUNICATIONS, 2015, 9196 : 79 - 87
  • [43] Using evolution strategy for cooperative focused crawling on semantic web
    Jason J. Jung
    Neural Computing and Applications, 2009, 18 : 213 - 221
  • [44] Using evolution strategy for cooperative focused crawling on semantic web
    Jung, Jason J.
    NEURAL COMPUTING & APPLICATIONS, 2009, 18 (03): : 213 - 221
  • [45] Clustering Web Services for Automatic Categorization
    Liang, Qianhui
    Li, Peipei
    Hung, Patrick C. K.
    Wu, Xindong
    2009 IEEE INTERNATIONAL CONFERENCE ON SERVICES COMPUTING, 2009, : 380 - +
  • [46] Meta-evolution strategy to focused crawling on semantic web
    Jung, Jason J.
    Jo, Geun-Sik
    Yeo, Seong-Won
    ARTIFICIAL NEURAL NETWORKS - ICANN 2007, PT 2, PROCEEDINGS, 2007, 4669 : 399 - +
  • [47] An adaptive focused Web crawling algorithm based on learning automata
    Javad Akbari Torkestani
    Applied Intelligence, 2012, 37 : 586 - 601
  • [48] Web Page Segmentation and its Application for Web Information Crawling
    Feng, Hanyang
    Zhang, Wenzhe
    Wu, Hesheng
    Wang, Chong-Jun
    2016 IEEE 28TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2016), 2016, : 598 - 605
  • [49] Hybrid Focused Crawling for Homemade Explosives Discovery on Surface and Dark Web
    Iliou, Christos
    Kalpakis, George
    Tsikrika, Theodora
    Vrochidis, Stefanos
    Kompatsiaris, Ioannis
    PROCEEDINGS OF 2016 11TH INTERNATIONAL CONFERENCE ON AVAILABILITY, RELIABILITY AND SECURITY, (ARES 2016), 2016, : 229 - 234
  • [50] An Extended Method for Finding Related Web Pages with Focused Crawling Techniques
    Furuse, Kazutaka
    Ohmura, Hiroaki
    Chen, Hanxiong
    Kitagawa, Hiroyuki
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT II: 15TH INTERNATIONAL CONFERENCE, KES 2011, 2011, 6882 : 21 - 30