SMARTCRAWLER: A PERSONALIZED WEB SEARCH FOR RELEVANT WEB PAGES

Cited by: 0
Authors
Wardekar, Arati Anilrao [1]
Gupta, Poonam [1]
Affiliations
[1] GH Raisoni Coll Engn & Management, Pune 412207, Maharashtra, India
Keywords
Web Crawler; Inner web; URL Feature selection; IP; Site frequency; Two-stage crawler; Site Ranking; Personalized web search
DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract
Web pages are not indexed quickly by crawling, and many crawlers have been developed to locate deep web ("inner web") interfaces efficiently. Because of the large volume of resources on the network and the dynamic nature of the deep web, obtaining good results remains a challenging problem. To address it, we propose a two-stage framework, SmartCrawler, for finding relevant deep web pages. SmartCrawler takes its seeds from a seed database. In the first stage, SmartCrawler performs a "reverse search" that matches the user's query against URLs. In the second stage, "Incremental Site Prioritize" is performed, in which the query is matched against the content of the form. Pages are then sorted into relevant and irrelevant according to matching frequency, and the relevant pages are ranked. High-ranking pages are displayed on the results page. The proposed crawler efficiently retrieves deep web interfaces from large databases and achieves better results than other existing crawlers. We also propose a comprehensive, personalized search that improves performance by taking into account how long the log file is kept. Earlier queries are shown when the search box receives focus, before a new query is entered.
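The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how queries are tokenized or how frequency is computed, so the whole-word tokenizer, the raw term count, and the function names `url_matches`, `term_frequency`, and `rank_pages` are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphanumeric tokens (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def url_matches(query, url):
    """Stage 1 (assumed form): keep a URL if any query term appears in it."""
    url_tokens = set(tokenize(url))
    return any(term in url_tokens for term in tokenize(query))


def term_frequency(query, content):
    """Stage 2 (assumed form): count occurrences of query terms in the content."""
    counts = Counter(tokenize(content))
    return sum(counts[term] for term in set(tokenize(query)))


def rank_pages(query, pages):
    """Return (url, score) pairs for relevant pages, highest score first.

    `pages` maps each URL to its page or form content. Pages whose URL does
    not match the query, or whose content never mentions it, are treated as
    irrelevant and dropped.
    """
    candidates = {u: c for u, c in pages.items() if url_matches(query, u)}
    scored = [(u, term_frequency(query, c)) for u, c in candidates.items()]
    relevant = [(u, s) for u, s in scored if s > 0]
    # Sort by descending score; break ties by URL for a stable order.
    return sorted(relevant, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical sample data, for illustration only.
sample_pages = {
    "http://example.com/books/search": "search books by title; books and more books",
    "http://example.com/books/forms": "books form",
    "http://example.org/book-store": "find a book here",
    "http://example.net/cars": "books books books",
    "http://example.com/books/empty": "no matching text",
}
ranking = rank_pages("books", sample_pages)
```

With this sample data, only the first two URLs survive both stages: the `book-store` and `cars` URLs fail the URL match, and the `empty` page has no query terms in its content. A real crawler would also need stemming, fetching, and form parsing, none of which is modeled here.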
Pages: 4
Related papers
50 in total
  • [11] Authoring of Personalized Web Page from Heterogeneous Web Pages by Content Extraction and Integration
    Li, Wei-gang
    Sun, Ke
    Wang, Shuo-chen
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTER NETWORKS AND COMMUNICATION TECHNOLOGY (CNCT 2016), 2016, 54 : 734 - 740
  • [12] Automatically Discovering Relevant Images From Web Pages
    Uzun, Erdinc
    Ozhan, Erkan
    Agun, Hayri Volkan
    Yerlikaya, Tarik
    Bulus, Halil Nusret
    IEEE ACCESS, 2020, 8 : 208910 - 208921
  • [13] A heuristic search for relevant images on the web
    Yu, FY
    Ip, HHS
    Leung, CH
    IMAGE AND VIDEO RETRIEVAL, PROCEEDINGS, 2005, 3568 : 599 - 608
  • [14] BEYOND RANKED LISTS IN WEB SEARCH: AGGREGATING WEB CONTENT INTO TOPIC PAGES
    Balasubramanian, Niranjan
    Cucerzan, Silviu
    INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING, 2010, 4 (04) : 509 - 534
  • [15] Personalized Web Search Using Clickthrough Data and Web Page Rating
    Peng, XuePing
    Niu, ZhenDong
    Huang, Sheng
    Zhao, Yumin
    JOURNAL OF COMPUTERS, 2012, 7 (10) : 2578 - 2584
  • [16] Caching personalized and database-related dynamic web pages
    Chang, Yeim-Kuan
    Lin, Yu-Ren
    Ting, Yi-Wei
    NAS: 2006 INTERNATIONAL WORKSHOP ON NETWORKING, ARCHITECTURE, AND STORAGES, PROCEEDINGS, 2006, : 149 - +
  • [17] Categorization of web pages - Performance enhancement to search engine
    Lakshminarayana, S.
    KNOWLEDGE-BASED SYSTEMS, 2009, 22 (01) : 100 - 104
  • [18] Visual Snippets: Summarizing Web Pages for Search and Revisitation
    Teevan, Jaime
    Cutrell, Edward
    Fisher, Danyel
    Drucker, Steven M.
    Ramos, Gonzalo
    Andre, Paul
    Hu, Chang
    CHI2009: PROCEEDINGS OF THE 27TH ANNUAL CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, VOLS 1-4, 2009, : 2023 - 2032
  • [19] Personalized Ranking Model Adaptation for Web Search
    Wang, Hongning
    He, Xiaodong
    Chang, Ming-Wei
    Song, Yang
    White, Ryen W.
    Chu, Wei
    SIGIR'13: THE PROCEEDINGS OF THE 36TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH & DEVELOPMENT IN INFORMATION RETRIEVAL, 2013, : 323 - 332
  • [20] PROS: A personalized ranking platform for web search
    Chirita, PA
    Olmedilla, D
    Nejdl, W
    ADAPTIVE HYPERMEDIA AND ADAPTIVE WEB-BASED SYSTEMS, PROCEEDINGS, 2004, 3137 : 34 - 43