Apache Spark Methods and Techniques in Big Data-A Review

Cited by: 2
Authors
Sahana, H. P. [1]
Sanjana, M. S. [1]
Muddasir, N. Mohammed [1]
Vidyashree, K. P. [1]
Affiliation
[1] Vidyavardhaka College of Engineering, Department of Information Science and Engineering, Mysuru, Karnataka, India
Keywords
Apache Spark; Big data; Data processing
DOI
10.1007/978-981-15-0146-3_67
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Major online sites such as Amazon, eBay, and Yahoo are now adopting Spark, and many organizations run it on clusters with thousands of nodes. Spark is a "rapid cluster computing" engine and a general-purpose data processing platform, backed by a thriving and active open-source community. Spark Core is the kernel of Apache Spark. In this paper, we discuss the uses and applications of Apache Spark, which has become mainstream among popular organizations. These organizations extract and collect event data from their users' daily activity and interact with that data in real time. As a result, Apache Spark is a next-generation big data tool. It offers both batch and streaming capabilities to process data more quickly.
Pages: 721-726
Page count: 6