SMARTCRAWLER: A PERSONALIZED WEB SEARCH FOR RELEVANT WEB PAGES

Citations: 0
Authors
Wardekar, Arati Anilrao [1]
Gupta, Poonam [1]
Affiliation
[1] GH Raisoni Coll Engn & Management, Pune 412207, Maharashtra, India
Keywords
Web Crawler; Inner web; URL Feature selection; IP; Site frequency; Two-stage crawler; Site Ranking; Personalized web search;
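DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification codes
0808; 0809;
Abstract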
Crawling does not index web pages quickly, and many crawlers have been developed to locate inner web interfaces efficiently. Because of the large volume of resources on the network and the dynamic nature of the deep web, achieving good results remains a challenging problem. To address it, we propose a two-stage framework, SmartCrawler, for finding relevant deep web pages. SmartCrawler takes its seeds from a seed database. In the first stage, SmartCrawler performs a "reverse search" that matches the user's query against URLs. In the second stage, "Incremental Site Prioritize" is performed, in which the query is matched against form content. Pages are then sorted into relevant and irrelevant according to matching frequency and ranked, and high-ranking pages are displayed on the results page. The proposed crawler efficiently retrieves deep web interfaces from large databases and achieves better results than other crawlers. We also propose a comprehensive, personalized search that improves performance by taking into account how long the log file is kept. Pre-query results are shown when the search box receives focus, before the query is entered.
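The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`url_matches`, `match_frequency`, `rank_pages`), the tokenization rule, and the use of a raw term count as the ranking score are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphanumeric terms (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def url_matches(query, url):
    """Stage one (hypothetical): keep a URL if any query term occurs in it."""
    url_lower = url.lower()
    return any(term in url_lower for term in tokenize(query))


def match_frequency(query, form_text):
    """Stage two (hypothetical): count occurrences of query terms in form content."""
    counts = Counter(tokenize(form_text))
    return sum(counts[term] for term in set(tokenize(query)))


def rank_pages(query, pages):
    """Rank pages for a query.

    pages maps each URL to the text of its form content.
    Returns (url, score) pairs for relevant pages, highest score first.
    Pages with a score of zero are treated as irrelevant and dropped.
    """
    candidates = {u: t for u, t in pages.items() if url_matches(query, u)}
    scored = [(u, match_frequency(query, t)) for u, t in candidates.items()]
    relevant = [(u, s) for u, s in scored if s > 0]
    return sorted(relevant, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_pages("books", pages)` with a dictionary of URLs and form text returns only the pages whose URL contains the query term and whose form content mentions it at least once, ordered by how often it appears.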
Pages: 4