Lazy ETL in Action: ETL Technology Dates Scientific Data

Cited by: 1
Authors
Kargin, Yagiz [1 ]
Ivanova, Milena [2 ]
Zhang, Ying [1 ]
Manegold, Stefan [1 ]
Kersten, Martin [1 ]
Affiliations
[1] CWI Amsterdam, Amsterdam, Netherlands
[2] Netherlands Esci Ctr, Amsterdam, Netherlands
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2013 / Vol. 6 / Issue 12
Funding
European Union Seventh Framework Programme (FP7)
DOI
10.14778/2536274.2536297
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Both scientific data and business data have analytical needs. Analysis takes place after a scientific data warehouse is eagerly filled with all data from external data sources (repositories). This is similar to the initial loading stage of Extract, Transform, and Load (ETL) processes that drive business intelligence. ETL can also help scientific data analysis. However, the initial loading is a time- and resource-consuming operation. It might not be entirely necessary, e.g., if the user is interested in only a subset of the data. We propose to demonstrate Lazy ETL, a technique to lower costs for initial loading. With it, ETL is integrated into the query processing of the scientific data warehouse. For a query, only the required data items are extracted, transformed, and loaded transparently on-the-fly. The demo is built around concrete implementations of Lazy ETL for seismic data analysis. The seismic data warehouse is ready for query processing, without waiting for long initial loading. The audience fires analytical queries to observe the internal mechanisms and modifications that realize each of the steps: lazy extraction, transformation, and loading.
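The lazy-loading idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which integrates ETL into the query processing of the warehouse itself; the class `LazyWarehouse`, its methods, and the toy station names are all hypothetical and serve only to show that a source is extracted, transformed, and loaded the first time a query touches it.

```python
from typing import Callable, Dict, List


class LazyWarehouse:
    """Toy warehouse (hypothetical) that loads a source only when a query needs it."""

    def __init__(self, sources: Dict[str, Callable[[], List[str]]]):
        # Maps a source name to a function that extracts its raw records.
        self.sources = sources
        # Data that has actually been loaded into the warehouse so far.
        self.loaded: Dict[str, List[float]] = {}
        # Number of sources that were really extracted.
        self.extract_count = 0

    def _ensure_loaded(self, name: str) -> List[float]:
        if name not in self.loaded:
            raw = self.sources[name]()         # lazy extraction
            self.extract_count += 1
            values = [float(r) for r in raw]   # lazy transformation
            self.loaded[name] = values         # lazy loading
        return self.loaded[name]

    def query_max(self, names: List[str]) -> float:
        # Only the sources named in the query are touched.
        return max(v for n in names for v in self._ensure_loaded(n))


# Assumed toy data standing in for external repositories.
sources = {
    "station_a": lambda: ["0.1", "0.7", "0.3"],
    "station_b": lambda: ["1.2", "0.4"],
    "station_c": lambda: ["0.9"],
}
wh = LazyWarehouse(sources)
result = wh.query_max(["station_a"])
```

After this query only `station_a` has been extracted. A later query over `station_a` and `station_b` extracts just `station_b` and reuses what is already loaded, and `station_c` is never read unless a query asks for it.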
Pages: 1286-1289
Page count: 4