Big Data Analysis Using Hadoop Cluster

Cited by: 0

Authors:

Saldhi, Ankita [1]

Goel, Abhinav [2]

Yadav, Dipesh [3]

Saldhi, Ankur [4]

Saksena, Dhruv [5]

Indu, S. [6]

Affiliations:

[1] Ctr Dev Telemat, Mandi Rd, Delhi 110030, India

[2] Aardee Solut, Delhi 110059, India

[3] Designo Interior, Delhi 110085, India

[4] Jamia Millia Islamia, Dept Comp Engn, Delhi 110025, India

[5] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA

[6] Delhi Technol Univ, Elect & Commun Engn Dept, Delhi 110042, India

Source:

2014 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH (IEEE ICCIC) | 2014

Keywords:

Big data; Hadoop; distributed data processing; data mining; Mappers; Reducers;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Industries track the statistics of their business and process this data with various data mining techniques to measure profit trends, revenue, growing markets, and promising investment opportunities. These statistical records grow continuously and rapidly. As the data grows, processing such a large data set and extracting meaningful information becomes a tedious task. When the data is generated in various formats, processing it poses further challenges. Owing to its size, big data is stored in the Hadoop Distributed File System (HDFS). In this standard architecture, all the DataNodes work in parallel, but each individual DataNode still executes its tasks sequentially. This paper proposes to execute the tasks assigned to a single DataNode in parallel instead of sequentially. We propose to use a set of streaming multiprocessors (SMs) for each DataNode. An SM can have several processors and memory, and all SMs run in parallel and independently. We process big data, which may come from different sources in different formats, in parallel on a Hadoop cluster using the proposed technique and obtain the desired results efficiently. We applied the proposed methodology to the raw data of an industrial firm, in support of intelligent business decisions, with the final objective of finding the profit generated by the firm and its trends throughout a year. We analyzed a full year of data because trends generally repeat after a year.
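The abstract's central idea, processing the tasks assigned to one DataNode concurrently and then aggregating profit per period, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record layout (`date,revenue,cost`), the function names, and the use of a thread pool as a stand-in for the streaming multiprocessors are all assumptions made for illustration, and no Hadoop API is involved.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_record(line):
    """Mapper: turn one record into a (month, profit) pair.

    Assumed (hypothetical) record layout: "YYYY-MM-DD,revenue,cost".
    """
    date, revenue, cost = line.strip().split(",")
    return date[:7], float(revenue) - float(cost)


def map_split(lines):
    """Run the mapper over one input split and pre-aggregate per month."""
    partial = defaultdict(float)
    for line in lines:
        month, profit = map_record(line)
        partial[month] += profit
    return dict(partial)


def reduce_partials(partials):
    """Reducer: merge the per-split partial sums into monthly totals."""
    totals = defaultdict(float)
    for partial in partials:
        for month, profit in partial.items():
            totals[month] += profit
    return dict(sorted(totals.items()))


def run_node(splits, workers=4):
    """Process the splits assigned to one node concurrently.

    The thread pool is only a stand-in for the SMs mentioned in the
    abstract; with workers=1 the splits are handled one after another.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_split, splits))
    return reduce_partials(partials)


# Made-up sample data: three splits assigned to a single node.
splits = [
    ["2013-01-05,1200.0,800.0", "2013-01-20,900.0,650.0"],
    ["2013-02-11,1500.0,1000.0"],
    ["2013-01-30,400.0,500.0", "2013-02-28,700.0,300.0"],
]
monthly_profit = run_node(splits)
print(monthly_profit)  # {'2013-01': 550.0, '2013-02': 900.0}
```

Because each split is mapped independently and the reducer only sums partial results, the monthly totals do not depend on how many workers handle the splits. In CPython a thread pool mainly illustrates the structure rather than delivering a CPU speed-up.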


Pages: 572-575

Page count: 4

50 items in total

[1] Mining the Associated Patterns in Big Data Using Hadoop Cluster
Asha, P.
Jacob, T. Prem
Pravin, A.
Asbern, A.
[J]. INTERNATIONAL CONFERENCE ON INTELLIGENT DATA COMMUNICATION TECHNOLOGIES AND INTERNET OF THINGS, ICICI 2018, 2019, 26 : 1255 - 1263
[2] Big Data Analysis using Apache Hadoop
Manikandan, Shankar Ganesh
Ravi, Siddarth
[J]. 2014 INTERNATIONAL CONFERENCE ON IT CONVERGENCE AND SECURITY (ICITCS), 2014,
[3] Inverted Indexing In Big Data Using Hadoop Multiple Node Cluster
Velusamy, Kaushik
Vijayaraju, Nivetha
Venkitaramanan, Deepthi
Suresh, Greeshma
Madhu, Divya
[J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2013, 4 (11) : 156 - 161
[4] Performance Modeling and Analysis of a Hadoop Cluster for Efficient Big Data Processing
Lim, JongBeom
Ahnh, Jong-Suk
Lee, Kang-Woo
[J]. ADVANCED SCIENCE LETTERS, 2016, 22 (09) : 2314 - 2319
[5] Information Retrieval Using Hadoop Big Data Analysis
Motwani, Deepak
Madan, Madan Lal
[J]. ADVANCES IN OPTICAL SCIENCE AND ENGINEERING, 2015, 166 : 409 - 415
[6] Hadoop Based Scalable Cluster Deduplication for Big Data
Liu, Qing
Fu, Yinjin
Ni, Guiqiang
Hou, Rui
[J]. 2016 IEEE 36TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS WORKSHOPS (ICDCSW 2016), 2016, : 98 - 105
[7] Demonetization-Twitter Data Analysis using Big Data & Hadoop
Goyal, Malvika
Anuranjana
[J]. PROCEEDINGS 2019 AMITY INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AICAI), 2019, : 156 - 158
[8] Application of Big Data for Medical Data Analysis Using Hadoop Environment
Roobini, M. S.
Lakshmi, M.
[J]. INTERNATIONAL CONFERENCE ON INTELLIGENT DATA COMMUNICATION TECHNOLOGIES AND INTERNET OF THINGS, ICICI 2018, 2019, 26 : 1128 - 1135
[9] Big Data Analysis Using Computational Intelligence and Hadoop: A Study
Gupta, Apoorva
[J]. 2015 2ND INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT (INDIACOM), 2015, : 1397 - 1401
[10] Big Data Analysis of Indian Premier League using Hadoop and MapReduce
Paul, Rajdeep
[J]. 2017 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN DATA SCIENCE (ICCIDS), 2017,
