FlowKV: A Semantic-Aware Store for Large-Scale State Management of Stream Processing Engines

Cited by: 1
Authors
Lee, Gyewon [1,2]
Maeng, Jaewoo [2]
Park, Jinsol [2]
Seo, Jangho [3]
Cho, Haeyoon [2]
Yang, Youngseok [4]
Um, Taegeon [5]
Lee, Jongsung [2,6]
Lee, Jae W. [2]
Chun, Byung-Gon [1,2]
Affiliations
[1] FriendliAI, Seoul, South Korea
[2] Seoul Natl Univ, Seoul, South Korea
[3] NAVER Corp, Seongnam, South Korea
[4] Mirny Inc, Seoul, South Korea
[5] Samsung Res, Seoul, South Korea
[6] Samsung Elect, Suwon, South Korea
Keywords
stream processing; KV store; state management; performance
DOI
10.1145/3552326.3567493
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We propose FlowKV, a persistent store tailored for large-scale state management of streaming applications. Unlike existing KV stores, FlowKV leverages information from stream processing engines by taking a principled approach toward exploiting information about how and when the applications access data. FlowKV categorizes data access patterns of window operations according to how window boundaries are set and how tuples inside a window are aggregated, and deploys customized in-memory and on-disk data structures optimized for each pattern. In addition, FlowKV takes window metadata as explicit arguments of read and write methods to predict the moment when a window is read, and then loads the tuples of windows in batches from storage ahead of time. Using the NEXMark benchmark as workload, our experiments show that Apache Flink on FlowKV outperforms Flink on RocksDB or Faster with up to 4.12x throughput gain.
Pages: 768-783
Page count: 16