HBelt: Integrating an Incremental ETL Pipeline with a Big Data Store for Real-Time Analytics

Cited by: 1
Authors
Qu, Weiping [1]
Shankar, Sahana [1]
Ganza, Sandy [1]
Dessloch, Stefan [1]
Affiliations
[1] University of Kaiserslautern, Heterogeneous Information Systems Group, D-67663 Kaiserslautern, Germany
Keywords
SYSTEM
DOI
10.1007/978-3-319-23135-8_9
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This paper demonstrates HBelt, a system that tightly integrates the distributed key-value data store HBase with an extended version of the ETL engine Kettle. The objective is to provide HBase tables with real-time data freshness in an efficient manner. A distributed ETL engine is extended and integrated as an overlay of HBase. In addition, we extend this ETL engine with the capability of processing incremental ETL flows in a pipelined fashion. Delta batches are defined by the MVCC component in HBase to flush the incremental ETL pipeline for multiple concurrent read requests. Experimental results show that HBelt can achieve high query throughput for real-time analytics.
Pages: 123-137
Page count: 15