Big Data Forensics on Apache Kafka

Author
Mager, Thomas
Affiliation
Source
Keywords
RECOVERY;
DOI
10.1007/978-3-031-49099-6_3
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
There is a growing demand for information exchange in the age of the Internet of Things. One common scenario involves transferring data from distributed devices in the field to central servers or cloud environments. However, little research has been done on the forensic investigation of supporting infrastructure such as Apache Kafka, which plays a crucial role in modern big data architectures. In this paper, we present our work on the forensic investigation of Apache Kafka. We use reverse-engineering methods to infer the data formats that Apache Kafka uses on the server side. The results allow us to implement a new module that can read Apache Kafka log files. An investigator can load the module into the open-source forensic platform "Autopsy". We highlight possibilities and limitations regarding encryption and data retention in Apache Kafka and suggest storing sensitive data in a decentralized manner. With such measures, applications become more resilient to attacks and can provide increased security, ethical standards, and freedom for their users. This can be a unique selling point in future data-driven applications.
Pages: 42-56
Page count: 15