Focused Crawler Based on Reinforcement Learning and Decaying Epsilon-Greedy Exploration Policy

Cited by: 1
Authors
Kaleel, Parisa Begum [1]
Sheen, Shina [1]
Affiliations
[1] PSG Coll Technol, Dept Appl Math & Computat Sci, Coimbatore, India
Keywords
Focused web crawling; infertility; information retrieval; reinforcement learning
DOI
10.34028/iajit/20/5/14
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To serve a diverse user base with a range of purposes, general search engines return results for a wide variety of topics and content categories on the Internet. Whereas general search engines offer broad coverage of the web, Focused Crawlers (FC) deliver more specialized and targeted results within particular domains or verticals. For a vertical search engine, the performance of the focused crawler is extremely important, and several approaches to improving it have been applied. We propose an intelligent focused crawler that uses Reinforcement Learning (RL) to prioritize hyperlinks for long-term profit. Our implementation differs from other RL-based works by encouraging learning at an early stage: it uses a decaying ε-greedy policy to select the next link, which enables the crawler to use the experience it gains to improve its performance and retrieve more relevant pages. With infertility rates increasing worldwide, many people need information about infertility issues and the artificial reproduction treatments available. Hence, we consider the infertility domain as a case study and collect web pages from scratch. We compare the performance of crawling tasks following ε-greedy and decaying ε-greedy policies. Experimental results show that crawlers following a decaying ε-greedy policy perform better.
Pages: 819-830
Number of pages: 12
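Note
The abstract contrasts a fixed ε-greedy policy with a decaying ε-greedy policy for choosing the next link. The following is a minimal sketch of that contrast, not the authors' implementation. The exponential decay schedule, the parameter values, the function names `decayed_epsilon` and `select_link`, and the representation of the crawl frontier as a dict of URL to estimated value are all assumptions made for illustration.

```python
import math
import random


def decayed_epsilon(step, eps_start=1.0, eps_min=0.05, decay_rate=0.01):
    """Return an exploration rate that shrinks from eps_start toward eps_min.

    The exponential form and the default values are assumptions for this
    sketch; the record above does not state the schedule used in the paper.
    """
    return eps_min + (eps_start - eps_min) * math.exp(-decay_rate * step)


def select_link(frontier, step, rng=random, fixed_epsilon=None):
    """Pick the next URL from `frontier`, a dict of url -> estimated value.

    With fixed_epsilon set, this behaves as plain epsilon-greedy.
    Otherwise epsilon decays with `step`, so early steps explore more
    and later steps mostly exploit the learned value estimates.
    """
    eps = fixed_epsilon if fixed_epsilon is not None else decayed_epsilon(step)
    if rng.random() < eps:
        # Explore: choose any frontier link uniformly at random.
        return rng.choice(sorted(frontier))
    # Exploit: choose the link with the highest estimated value.
    return max(frontier, key=frontier.get)


if __name__ == "__main__":
    # Hypothetical frontier with made-up value estimates.
    frontier = {
        "https://example.org/a": 0.2,
        "https://example.org/b": 0.9,
        "https://example.org/c": 0.5,
    }
    rng = random.Random(0)
    for step in (0, 100, 1000):
        print(step, round(decayed_epsilon(step), 3), select_link(frontier, step, rng))
```

Passing `fixed_epsilon` gives the plain ε-greedy baseline, while leaving it unset gives the decaying variant, so the two policies can be compared on the same frontier.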