FlowKV: A Semantic-Aware Store for Large-Scale State Management of Stream Processing Engines

Cited by: 1
Authors
Lee, Gyewon [1, 2]
Maeng, Jaewoo [2]
Park, Jinsol [2]
Seo, Jangho [3]
Cho, Haeyoon [2]
Yang, Youngseok [4]
Um, Taegeon [5]
Lee, Jongsung [2, 6]
Lee, Jae W. [2]
Chun, Byung-Gon [1, 2]
Affiliations
[1] FriendliAI, Seoul, South Korea
[2] Seoul Natl Univ, Seoul, South Korea
[3] NAVER Corp, Seongnam, South Korea
[4] Mirny Inc, Seoul, South Korea
[5] Samsung Res, Seoul, South Korea
[6] Samsung Elect, Suwon, South Korea
Keywords
stream processing; KV store; state management; performance
DOI
10.1145/3552326.3567493
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We propose FlowKV, a persistent store tailored for large-scale state management of streaming applications. Unlike existing KV stores, FlowKV leverages information from stream processing engines by taking a principled approach toward exploiting information about how and when the applications access data. FlowKV categorizes data access patterns of window operations according to how window boundaries are set and how tuples inside a window are aggregated, and deploys customized in-memory and on-disk data structures optimized for each pattern. In addition, FlowKV takes window metadata as explicit arguments of read and write methods to predict the moment when a window is read, and then loads the tuples of windows in batches from storage ahead of time. Using the NEXMark benchmark as workload, our experiments show that Apache Flink on FlowKV outperforms Flink on RocksDB or Faster with up to 4.12x throughput gain.
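The idea of passing window metadata to the store so that it can load a window's tuples in a batch before the window is read can be illustrated with a toy sketch. This is not FlowKV's API or data layout: the class `WindowAwareStore`, its methods, the `lookahead` parameter, and the use of a watermark to estimate when a window will be read are all assumptions made for illustration, and a plain dictionary stands in for on-disk storage.

```python
from collections import defaultdict


class WindowAwareStore:
    """Toy illustration only; every name here is hypothetical, not FlowKV's API."""

    def __init__(self, lookahead):
        self.lookahead = lookahead      # how far past the watermark to prefetch (assumed knob)
        self.disk = defaultdict(list)   # stands in for on-disk storage, keyed by window end time
        self.cache = {}                 # windows already loaded into memory
        self.disk_reads = 0             # number of batch loads from "disk"

    def append(self, window_end, value):
        # The write call carries the window boundary as an explicit argument.
        self.disk[window_end].append(value)
        if window_end in self.cache:
            # Keep an already prefetched window up to date.
            self.cache[window_end].append(value)

    def advance_watermark(self, watermark):
        # Windows ending within `lookahead` of the watermark are expected to be
        # read soon, so each one is loaded from "disk" in a single batch.
        for window_end in sorted(self.disk):
            if window_end <= watermark + self.lookahead and window_end not in self.cache:
                self.cache[window_end] = list(self.disk[window_end])
                self.disk_reads += 1

    def read_window(self, window_end):
        # Serve from memory if the window was prefetched; otherwise load it now.
        if window_end in self.cache:
            values = self.cache.pop(window_end)
        else:
            values = list(self.disk[window_end])
            self.disk_reads += 1
        self.disk.pop(window_end, None)  # the window is consumed once read
        return values
```

In this sketch, a read of a window that was prefetched by `advance_watermark` does not touch the simulated disk, while a read of a window outside the lookahead range pays for the load at read time.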
Pages: 768-783
Page count: 16