NoSQL Web Crawler Application

Cited by: 0

Author:

Deka, Ganesh Chandra [1]

Affiliation:

[1] Minist Skill Dev & Entrepreneurship, Directorate Gen Training, New Delhi, India

Source:

DEEP DIVE INTO NOSQL DATABASES: THE USE CASES AND APPLICATIONS | 2018, Vol. 109

Keywords:

DOI:

10.1016/bs.adcom.2017.08.001

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Subject classification code:

0812

Abstract:

With the advent of Web technology, the Web has filled with unstructured data, often called Big Data. However, these data are not easy to collect, access, and process at large scale. Web crawling is an optimization problem. Site-specific crawling of social media platforms, e-commerce websites, blogs, news websites, and forums is a requirement for business organizations that need to answer search queries from webpages. Indexing a huge number of webpages requires a cluster with several petabytes of usable disk. Because NoSQL databases are highly scalable, their use for storing crawler data is increasing along with their growing popularity. This chapter discusses the application of NoSQL databases in Web crawler applications to store the data collected by the crawler.


Pages: 77-100

Page count: 24
