Big Data Network Flow Processing Using Apache Spark

Authors
Jerabek, Kamil [1]
Rysavy, Ondrej [1]
Affiliations
[1] Brno Univ Technol, Brno, Czech Republic
Keywords
Big Data; Network flows; Apache Spark; Cassandra; Apache Ignite
DOI
10.1145/3352700.3352709
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The growing volume of traffic flows captured during network monitoring makes their analysis increasingly difficult. One goal of network traffic analysis is to identify malicious communication. In this paper, we present a new system for big data network flow classification and clustering. The proposed system is built on popular big data engines such as Apache Spark and Apache Ignite. The experiments conducted demonstrate the feasibility of the proposed approach and indicate its potential scalability.
Pages: 9