SETL: A programmable semantic extract-transform-load framework for semantic data warehouses

Cited by: 22
Authors
Deb Nath, Rudra Pratap [1, 2]
Hose, Katja [1]
Pedersen, Torben Bach [1]
Romero, Oscar [2]
Affiliations
[1] Aalborg Univ, Aalborg, Denmark
[2] Univ Politecn Cataluna, BarcelonaTech, Barcelona, Spain
Keywords
ETL; RDF; Semantic integration; Data warehouse; Semantic-aware; Knowledge base
DOI
10.1016/j.is.2017.01.005
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To make better decisions in business analytics, organizations increasingly use external structured, semi-structured, and unstructured data in addition to the (mostly structured) internal data. Current Extract-Transform-Load (ETL) tools are not suitable for this "open world scenario" because they do not consider semantic issues in the integration process. Current ETL tools neither support processing semantic data nor create a semantic Data Warehouse (DW), a repository of semantically integrated data. This paper describes our programmable Semantic ETL (SETL) framework. SETL builds on Semantic Web (SW) standards and tools and supports developers by offering a number of powerful modules, classes, and methods for (dimensional and semantic) DW constructs and tasks. Thus it supports semantic data sources in addition to traditional data sources, semantic integration, and creating or publishing a semantic (multidimensional) DW in terms of a knowledge base. A comprehensive experimental evaluation comparing SETL to a solution made with traditional tools (requiring much more hand-coding) on a concrete use case shows that SETL provides better programmer productivity, knowledge base quality, and performance. (C) 2017 Elsevier Ltd. All rights reserved.
Pages: 17-43
Page count: 27
Related papers
50 in total
  • [1] Integrating Big Data: A Semantic Extract-Transform-Load Framework
    Bansal, Srividya K.
    Kagemann, Sebastian
    [J]. COMPUTER, 2015, 48 (03): 42 - 50
  • [2] Towards a Semantic Extract-Transform-Load (ETL) framework for Big Data Integration
    Bansal, Srividya K.
    [J]. 2014 IEEE INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS), 2014: 521 - 528
  • [3] Data discovery method for Extract-Transform-Load
    Madhikermi, Manik
    Framling, Kary
    [J]. 2019 IEEE 10TH INTERNATIONAL CONFERENCE ON MECHANICAL AND INTELLIGENT MANUFACTURING TECHNOLOGIES (ICMIMT 2019), 2019: 174 - 181
  • [4] UDP: A Programmable Accelerator for Extract-Transform-Load Workloads and More
    Fang, Yuanwei
    Zou, Chen
    Elmore, Aaron J.
    Chien, Andrew A.
    [J]. 50TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2017: 55 - 68
  • [5] Data Model Logger - Data Discovery for Extract-Transform-Load
    Madhikermi, Manik
    Buda, Andrea
    Dave, Bhargav
    Framling, Kary
    [J]. 2017 19TH IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS (HPCC) / 2017 15TH IEEE INTERNATIONAL CONFERENCE ON SMART CITY (SMARTCITY) / 2017 3RD IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (DSS), 2017: 629 - 630
  • [6] 4D-SETL A Semantic Data Integration Framework
    de Cesare, Sergio
    Foy, George
    Lycett, Mark
    [J]. PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS, VOL 1 (ICEIS), 2016: 127 - 134
  • [7] Extract-Transform-Load for Video Streams
    Kossmann, Ferdi
    Wu, Ziniu
    Lai, Eugenie
    Tatbul, Nesime
    Cao, Lei
    Kraska, Tim
    Madden, Sam
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 16 (09): 2302 - 2315
  • [8] Testing Extract-Transform-Load Process in Data Warehouse Systems
    Homayouni, Hajar
    [J]. 2018 29TH IEEE INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING WORKSHOPS (ISSREW), 2018: 158 - 161
  • [9] A Survey of Extract-Transform-Load Technology
    Vassiliadis, Panos
    [J]. INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2009, 5 (03): 1 - 27
  • [10] Research on data center construction based on extract-transform-load (ETL)
    Cai, Li
    Su, Jianying
    [J]. AGRO FOOD INDUSTRY HI-TECH, 2017, 28 (03): 947 - 949