FlowKV: A Semantic-Aware Store for Large-Scale State Management of Stream Processing Engines

Citations: 1
Authors
Lee, Gyewon [1, 2]
Maeng, Jaewoo [2]
Park, Jinsol [2]
Seo, Jangho [3]
Cho, Haeyoon [2]
Yang, Youngseok [4]
Um, Taegeon [5]
Lee, Jongsung [2, 6]
Lee, Jae W. [2]
Chun, Byung-Gon [1, 2]
Affiliations
[1] FriendliAI, Seoul, South Korea
[2] Seoul Natl Univ, Seoul, South Korea
[3] NAVER Corp, Seongnam, South Korea
[4] Mirny Inc, Seoul, South Korea
[5] Samsung Res, Seoul, South Korea
[6] Samsung Elect, Suwon, South Korea
Keywords
stream processing; KV store; state management; PERFORMANCE;
DOI
10.1145/3552326.3567493
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
We propose FlowKV, a persistent store tailored for large-scale state management of streaming applications. Unlike existing KV stores, FlowKV leverages information from stream processing engines by taking a principled approach toward exploiting information about how and when the applications access data. FlowKV categorizes data access patterns of window operations according to how window boundaries are set and how tuples inside a window are aggregated, and deploys customized in-memory and on-disk data structures optimized for each pattern. In addition, FlowKV takes window metadata as explicit arguments of read and write methods to predict the moment when a window is read, and then loads the tuples of windows in batches from storage ahead of time. Using the NEXMark benchmark as workload, our experiments show that Apache Flink on FlowKV outperforms Flink on RocksDB or Faster with up to 4.12x throughput gain.
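The prefetching idea in the abstract (passing window metadata to reads and writes so that a window's tuples can be loaded in a batch before they are needed) can be illustrated with a toy sketch. This is not FlowKV's API or implementation: the class `WindowAwareStore`, its methods, the `prefetch_horizon` parameter, and the use of a plain dictionary in place of on-disk storage are all assumptions made for illustration.

```python
class WindowAwareStore:
    """Toy window-aware store. Hypothetical; NOT FlowKV's actual API."""

    def __init__(self, prefetch_horizon):
        # How far ahead of a window's end time we load it (assumed parameter).
        self.prefetch_horizon = prefetch_horizon
        self.disk = {}    # stands in for on-disk storage: window_end -> {key: [values]}
        self.cache = {}   # windows already loaded into memory
        self.batch_loads = 0  # counts whole-window loads from "disk"

    def put(self, key, value, window_end):
        # The window end time is an explicit argument of the write.
        if window_end in self.cache:
            target = self.cache[window_end]
        else:
            target = self.disk.setdefault(window_end, {})
        target.setdefault(key, []).append(value)

    def prefetch(self, now):
        # Load every window that will be read soon, one batch per window.
        for window_end in sorted(self.disk):
            if window_end <= now + self.prefetch_horizon:
                self.cache[window_end] = self.disk.pop(window_end)
                self.batch_loads += 1

    def get(self, key, window_end):
        # On a cache miss, fall back to loading the window on demand.
        if window_end not in self.cache:
            self.cache[window_end] = self.disk.pop(window_end, {})
            self.batch_loads += 1
        return self.cache[window_end].get(key, [])


store = WindowAwareStore(prefetch_horizon=5)
store.put("a", 1, window_end=10)
store.put("a", 2, window_end=10)
store.put("b", 7, window_end=20)
store.prefetch(now=6)                   # window 10 falls within 6 + 5; window 20 does not
result = store.get("a", window_end=10)  # served from the prefetched cache
```

In this sketch the store can decide what to load ahead of time only because the caller supplies `window_end` on every write, which mirrors the abstract's point that window metadata makes read times predictable.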
Pages: 768-783
Page count: 16