Data discovery method for Extract-Transform-Load

Authors
Madhikermi, Manik [1]
Framling, Kary [1]
Affiliations
[1] Aalto Univ, Sch Sci, POB 15400, FI-00076 Espoo, Finland
Funding
EU Horizon 2020
Keywords
ETL; Database; Trigger; Reverse Engineering; Data Warehouse; Information System; Information Retrieval; Process Mapping; Data Discovery; MODEL;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Information Systems (ISs) are fundamental to streamlining the operations and supporting the processes of any modern enterprise. Being able to perform analytics over the data managed in various enterprise ISs is becoming increasingly important for organisational growth. Extract, Transform, and Load (ETL) are the necessary pre-processing steps of any data mining activity. Due to the complexity of modern ISs, extracting data is becoming increasingly complicated and time-consuming. To ease the process, this paper proposes a methodology and a pilot implementation that aim to simplify the data extraction process by leveraging end-users' knowledge and understanding of the specific IS. The paper first gives a brief introduction and reviews the current state of the art in ETL processes and techniques. It then explains the proposed methodology in detail. Finally, it reports test results for typical data-extraction tasks on four commercial ISs.
Pages: 174-181
Page count: 8