Code Generation in Serializers and Comparators of Apache Flink

Cited by: 2
Authors
Horvath, Gabor [1 ]
Pataki, Norbert [1 ]
Balassi, Marton [2 ]
Affiliations
[1] Eotvos Lorand Univ, Dept Programming Languages & Compilers, Fac Informat, Budapest, Hungary
[2] Hungarian Acad Sci, Informat Lab, Inst Comp Sci & Control, Budapest, Hungary
Keywords
Java; Janino; code generation; big data; Flink;
DOI
10.1145/3098572.3098579
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
There is a shift in the Big Data world. Applications used to be I/O bound, but InfiniBand and SSDs have reduced the I/O overhead, and more sophisticated algorithms have been developed. As a result, the CPU has become a bottleneck for some applications. With state-of-the-art CPUs, reduced CPU usage can also lower electricity costs even when an application is I/O bound. Apache Flink is an open source framework for processing streams of data and batch jobs. It uses serialization for a wide variety of purposes: not only for sending data over the network, saving it to disk, or providing fault tolerance, but also because some operators can work on the serialized representation of the data instead of on Java objects. This approach can improve performance significantly. Flink has a custom serialization method that enables operators to work on the serialized formats. Currently, Apache Flink uses reflection to serialize Plain Old Java Objects (POJOs). Reflection in Java is notoriously slow, and the structure of reflective code is harder for the JIT compiler to optimize. As a Google Summer of Code project in 2016, we implemented code generation for the serializers and comparators of POJOs to improve the performance of Apache Flink. Flink has a rich type system that provides a lot of information about the types that need to be serialized. Using this information, it is possible to generate specialized code with great performance. We achieved a more than 6x performance improvement in serialization, which translated into a 20% overall improvement.
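The contrast the abstract draws between reflective and specialized serialization can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: the `Point` POJO, the class `SerializerSketch`, and both methods are assumptions, not Flink's serializer API or the code the authors generate. The specialized method is written by hand to show the shape of code a generator could emit once the field types are known in advance.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Comparator;

public class SerializerSketch {
    // Hypothetical POJO, used only for illustration.
    public static class Point {
        public int x;
        public double y;
        public Point() {}
        public Point(int x, double y) { this.x = x; this.y = y; }
    }

    // Reflective path: fields and their types are inspected at run time for every record.
    public static byte[] serializeReflectively(Object pojo)
            throws IOException, IllegalAccessException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        Field[] fields = pojo.getClass().getFields();
        // getFields() returns fields in no guaranteed order, so sort by name for a stable layout.
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        for (Field f : fields) {
            if (f.getType() == int.class) {
                out.writeInt(f.getInt(pojo));
            } else if (f.getType() == double.class) {
                out.writeDouble(f.getDouble(pojo));
            } else {
                throw new IllegalArgumentException("Unsupported field type: " + f.getType());
            }
        }
        out.flush();
        return bytes.toByteArray();
    }

    // Specialized path: the field layout is fixed in the code itself, with direct field access
    // and no per-record type dispatch. A code generator could emit a method of this shape.
    public static byte[] serializeSpecialized(Point p) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(p.x);
        out.writeDouble(p.y);
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Point p = new Point(3, 4.5);
        // Both paths produce the same byte layout: 4 bytes for x followed by 8 bytes for y.
        System.out.println(Arrays.equals(serializeReflectively(p), serializeSpecialized(p)));
    }
}
```

Both methods write the same bytes; the difference lies in how much work happens per record. The sketch makes no claim about the size of the speedup, which the abstract reports only for the authors' own implementation.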
Pages: 6