Big Data Forensics on Apache Kafka

Cited by: 0

Authors:

Mager, Thomas

Source:

INFORMATION SYSTEMS SECURITY, ICISS 2023 | 2023 / Vol. 14424

Keywords:

RECOVERY;

DOI:

10.1007/978-3-031-49099-6_3

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

There is a growing demand for information exchange in the age of the Internet of Things. One common scenario involves transferring data from distributed devices in the field to central servers or cloud environments. However, little research has been done on the possibilities for forensic investigation of supporting infrastructure such as Apache Kafka, which plays a crucial role in modern big data architectures. In this paper, we present our work on the forensic investigation of Apache Kafka. We use reverse engineering methodologies to infer the data formats that Apache Kafka uses on the server side. The results help us implement a new module that can read Apache Kafka log files. An investigator can load the module into the open-source forensic platform "Autopsy". We highlight possibilities and limitations regarding encryption and data retention in Apache Kafka and suggest storing sensitive data in a decentralized manner. As a result of these measures, applications become more resilient to attacks and are able to provide increased security, ethical standards, and freedom for the application users. This can be a unique selling point in future data-driven applications.

Pages: 42-56

Page count: 15
