Focused Web Crawler for Indonesian Recipes

Authors
Alfarisy, Gusti Ahmad Fanshuri [1]
Bachtiar, Fitra A. [1]
Affiliations
[1] Brawijaya Univ, Fac Comp Sci, Malang, Indonesia
Keywords
focused crawler; Indonesia recipe; information retrieval; Jaccard similarity; ALGORITHM;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Crawlers are commonly used to traverse and collect public web pages that are connected through links. General-purpose crawlers are not suited to collecting web pages on a particular topic, such as food recipes. This paper proposes a focused web crawler for Indonesian food recipes that uses a simple classification based on an analysis of Indonesian recipes available on the internet, assigns priority levels to links based on their anchor text and URLs, and restricts traversal by depth. The focused crawler is tested on 4 different queries, collecting 100 recipes for each. The results show that the focused web crawler achieves a higher relevance, 81.75%, than a general crawler using breadth-first traversal, which achieves 16.00%. Furthermore, in the same amount of time, the focused web crawler collects more relevant web pages than the general crawler. Therefore, the proposed crawler can effectively collect recipes from the web based on a user query.
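The abstract names three ingredients: link priority from anchor text and URLs, a depth limit, and (in the keywords) Jaccard similarity. The sketch below shows one way these could fit together. It is not the authors' implementation: the scoring rule in `link_priority`, the tokenizer, the `get_links` callback, and the default `max_depth` are all assumptions made for illustration, and the page classification step is omitted.

```python
import heapq
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_priority(query, anchor_text, url):
    """Hypothetical score: the better of anchor-text and URL similarity."""
    q = tokens(query)
    return max(jaccard(q, tokens(anchor_text)), jaccard(q, tokens(url)))


def crawl_order(query, seeds, get_links, max_depth=2):
    """Return URLs in best-first visiting order, bounded by max_depth.

    seeds is a list of (anchor_text, url) pairs. get_links(url) is a
    caller-supplied stand-in for fetching a page and extracting its
    links as (anchor_text, url) pairs.
    """
    frontier, seen, order = [], set(), []
    tie = 0  # insertion counter: breaks ties between equal scores

    def push(anchor, url, depth):
        nonlocal tie
        if url in seen or depth > max_depth:
            return
        seen.add(url)
        # heapq is a min-heap, so negate the score to pop the best link first.
        score = link_priority(query, anchor, url)
        heapq.heappush(frontier, (-score, tie, depth, url))
        tie += 1

    for anchor, url in seeds:
        push(anchor, url, 0)
    while frontier:
        _, _, depth, url = heapq.heappop(frontier)
        order.append(url)
        for anchor, child in get_links(url):
            push(anchor, child, depth + 1)
    return order
```

A breadth-first crawler corresponds to replacing the priority queue with a plain FIFO queue, which is the baseline the abstract compares against.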
Pages: 196-202
Page count: 7