Apache Spark Methods and Techniques in Big Data-A Review

Cited by: 2
Authors
Sahana, H. P. [1]
Sanjana, M. S. [1]
Muddasir, N. Mohammed [1]
Vidyashree, K. P. [1]
Affiliations
[1] Vidyavardhaka Coll Engn, Dept Informat Sci & Engn, Mysuru, Karnataka, India
Keywords
Apache Spark; Big data; Data processing;
DOI
10.1007/978-981-15-0146-3_67
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Major online sites such as Amazon, eBay, and Yahoo are now adopting Spark. Many organizations run Spark on clusters with thousands of nodes. Spark is a fast cluster-computing system and a broad data processing platform. It has a thriving and active open-source community. Spark Core is the kernel of Apache Spark. In this paper, we discuss the uses and applications of Apache Spark, which has become mainstream among popular organizations. These organizations collect and extract event data from their users' daily activity and interact with that data in real time. As a result, Apache Spark is a next-generation big data tool. It offers both batch and streaming capabilities to process data more quickly.
Pages: 721-726
Page count: 6