Data discovery method for Extract-Transform-Load

Authors
Madhikermi, Manik [1]
Framling, Kary [1]
Affiliations
[1] Aalto Univ, Sch Sci, POB 15400, FI-00076 Espoo, Finland
Funding
European Union Horizon 2020
Keywords
ETL; Database; Trigger; Reverse Engineering; Data Warehouse; Information System; Information Retrieval; Process Mapping; Data Discovery; MODEL
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Information Systems (ISs) are fundamental to streamlining the operations and supporting the processes of any modern enterprise. Being able to perform analytics over the data managed in various enterprise ISs is becoming increasingly important for organisational growth. Extract, Transform, and Load (ETL) are the necessary pre-processing steps of any data mining activity. Due to the complexity of modern ISs, extracting data is becoming increasingly complicated and time-consuming. To ease the process, this paper proposes a methodology and a pilot implementation that aim to simplify the data extraction process by leveraging the end-users' knowledge and understanding of the specific IS. The paper first gives a brief introduction and reviews the current state of the art in ETL processes and techniques. It then explains the proposed methodology in detail. Finally, it reports test results for typical data-extraction tasks on four commercial ISs.
Pages: 174-181
Number of pages: 8