Keyword query based focused Web crawler

Cited by: 28
Authors
Kumar, Manish [1]
Bindal, Ankit [1]
Gautam, Robin [1]
Bhatia, Rajesh [1]
Affiliations
[1] PEC Univ Technol, Chandigarh 160012, India
Keywords
Web crawler; Information retrieval; Focused Web Crawler; Query based crawler
DOI
10.1016/j.procs.2017.12.075
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Finding information on the Web is difficult because of the extremely large volume of data. Search engines can facilitate this task, but it is still difficult to cover all the webpages present on the Web. This paper proposes a query-based crawler in which a set of keywords relevant to the user's topic of interest is used to issue queries to search interfaces. These search interfaces are found on the webpages of the website corresponding to the seed URL. This helps the crawler obtain the most relevant links from a domain without crawling that domain in depth. No existing focused crawling approach uses a query-based approach to find webpages of interest. In the proposed crawler, a list of keywords is passed to the search query interfaces found on the websites. The proposed work aims to return the most relevant information for the keywords in a particular domain without crawling through the many irrelevant links in between. © 2018 The Authors. Published by Elsevier B.V. Peer-review under responsibility of the scientific committee of the 6th International Conference on Smart Computing and Communications.
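The query step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `SearchFormFinder`, the function `build_query_urls`, and the heuristic "a GET form with a text or search input is a search interface" are all assumptions made for illustration. The sketch only builds the query URLs for a seed page; fetching the result pages and extracting their links is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class SearchFormFinder(HTMLParser):
    """Collect GET forms that contain a text-like input (assumed search boxes)."""

    def __init__(self):
        super().__init__()
        self.forms = []        # (action, input_name) pairs, one per form
        self._action = ""      # action of the form currently being parsed
        self._in_form = False  # True while inside a GET form with no field yet

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            method = (a.get("method") or "get").lower()
            self._in_form = method == "get"
            self._action = a.get("action") or ""
        elif tag == "input" and self._in_form:
            kind = (a.get("type") or "text").lower()
            if kind in ("text", "search") and a.get("name"):
                self.forms.append((self._action, a["name"]))
                self._in_form = False  # keep only the first text field per form

    def handle_endtag(self, tag):
        if tag == "form":
            self._in_form = False


def build_query_urls(seed_url, html, keywords):
    """Return one query URL per (search form, keyword) pair found on the seed page.

    Simplification: hidden form fields are ignored and the form action is
    assumed to carry no query string of its own.
    """
    finder = SearchFormFinder()
    finder.feed(html)
    urls = []
    for action, field in finder.forms:
        endpoint = urljoin(seed_url, action)
        for keyword in keywords:
            urls.append(endpoint + "?" + urlencode({field: keyword}))
    return urls


# Hypothetical seed page containing a single search box.
page = '<form action="/search" method="get"><input type="text" name="q"></form>'
query_urls = build_query_urls(
    "http://example.com/index.html", page, ["web crawler", "focused"]
)
```

A crawler built along these lines would then fetch each URL in `query_urls` and keep the links on the result pages, instead of following every link reachable from the seed page.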
Pages: 584-590
Number of pages: 7
Related papers
50 in total
  • [21] A Focused Crawler for Web Feature Service and Web Map Service Discovering
    Alexandrino, Victor Macedo
    Comarela, Giovanni
    da Silva, Altigran Soares
    Lisboa-Filho, Jugurta
    WEB AND WIRELESS GEOGRAPHICAL INFORMATION SYSTEMS (W2GIS 2020), 2020, 12473 : 111 - 124
  • [22] Optimized Focused Web Crawler with Natural Language Processing Based Relevance Measure in Bioinformatics Web Sources
    Sekhar, S. R. Mani
    Siddesh, G. M.
    Manvi, Sunilkumar S.
    Srinivasa, K. G.
    CYBERNETICS AND INFORMATION TECHNOLOGIES, 2019, 19 (02) : 146 - 158
  • [23] On Functional Requirements for Keyword-based Query over Heterogeneous Databases on the Web
    Leitao-Junior, Plinio S.
    de Lucena, Fabio Nogueira
    Ramada, Mariana Soller
    Ribeiro, Leonardo Andrade
    da Silva, Joao Carlos
    PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS (ICEIS 2021), VOL 1, 2021, : 224 - 231
  • [24] Automatic Web service composition driven by keyword query
    Dongjin Yu
    Lei Zhang
    Chengfei Liu
    Rui Zhou
    Dengwei Xu
    World Wide Web, 2020, 23 : 1665 - 1692
  • [25] Automatic Web service composition driven by keyword query
    Yu, Dongjin
    Zhang, Lei
    Liu, Chengfei
    Zhou, Rui
    Xu, Dengwei
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2020, 23 (03): : 1665 - 1692
  • [26] A Focused Crawler Based on Correlation Analysis
    Qin, Qiuli
    Peng, Xin
    INTERNATIONAL JOURNAL OF FUTURE GENERATION COMMUNICATION AND NETWORKING, 2014, 7 (06): : 13 - 20
  • [27] Research and realization of E-commerce monitor system based on focused web crawler
    Chen, Xue Gang
    Information Technology Journal, 2013, 12 (17) : 4033 - 4039
  • [28] Ontology based learnable focused crawler
    Software School, Xiamen Univ., Xiamen 361005, China
    Unknown
    J. Comput. Inf. Syst., 2007, 3 (1173-1180):
  • [29] An ontology-based focused crawler
    Kozanidis, Lefteris
    NATURAL LANGUAGE AND INFORMATION SYSTEMS, PROCEEDINGS, 2008, 5039 : 376 - 379
  • [30] Customized focused crawler for peer-to-peer Web search
    Fang, Qiming
    Yang, Guangwen
    Wu, Yongwei
    Zhu, Anping
    Zheng, Weimin
    Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2007, 35 (SUPPL. 2): : 148 - 152