A UML based approach for modeling ETL processes in data warehouses

Cited by: 0
Authors
Trujillo, J [1]
Luján-Mora, S [1]
Affiliations
[1] Univ Alicante, Dept Lenguajes & Sistemas Informat, Alicante, Spain
Keywords
ETL processes; data warehouses; conceptual modeling; UML
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data warehouses (DWs) are complex computer systems whose main goal is to facilitate the decision-making process of knowledge workers. ETL (Extraction-Transformation-Loading) processes are responsible for extracting data from heterogeneous operational data sources, transforming it (conversion, cleaning, normalization, etc.) and loading it into DWs. ETL processes are a key component of DWs because incorrect or misleading data will produce wrong business decisions; therefore, a correct design of these processes at the early stages of a DW project is absolutely necessary to improve data quality. However, not much research has dealt with the modeling of ETL processes. In this paper, we present our approach, based on the Unified Modeling Language (UML), which allows us to accomplish the conceptual modeling of these ETL processes. We provide the necessary mechanisms for an easy and quick specification of the common operations defined in these ETL processes, such as the integration of different data sources, the transformation between source and target attributes, the generation of surrogate keys and so on. Another advantage of our proposal is the use of the UML (standardization, ease of use and functionality) and the seamless integration of the design of the ETL processes with the DW conceptual schema.
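The three common operations named in the abstract (integrating different data sources, transforming source attributes into target attributes, and generating surrogate keys) can be illustrated with a minimal Python sketch. This is not the UML notation proposed in the paper; the source names, attribute names, and functions below are all hypothetical and chosen only for illustration.

```python
from itertools import count

# Hypothetical rows from two heterogeneous operational sources (assumed data).
crm_rows = [{"cust_id": "C-17", "name": " alice "}, {"cust_id": "C-42", "name": "BOB"}]
erp_rows = [{"customer": "C-42", "full_name": "Bob"}, {"customer": "C-99", "full_name": "carol"}]


def extract(crm, erp):
    """Integrate both sources into one stream with a shared attribute layout."""
    for row in crm:
        yield {"source_key": row["cust_id"], "name": row["name"]}
    for row in erp:
        yield {"source_key": row["customer"], "name": row["full_name"]}


def transform(row):
    """Map source attributes to target attributes (trim whitespace, normalize case)."""
    return {"source_key": row["source_key"], "name": row["name"].strip().title()}


def load(rows):
    """Assign one surrogate key per distinct source key; the first occurrence wins."""
    next_key = count(1)
    dimension = {}
    for row in rows:
        if row["source_key"] not in dimension:
            dimension[row["source_key"]] = {"customer_sk": next(next_key), **row}
    return list(dimension.values())


customer_dim = load(transform(r) for r in extract(crm_rows, erp_rows))
```

The surrogate key `customer_sk` is independent of the operational key `source_key`, so the same customer arriving from both sources ends up as a single row in the target dimension.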
Pages: 307-320
Page count: 14
Related papers
  • [1] Modelling ETL Processes of Data Warehouses with UML Activity Diagrams
    Munoz, Lilia
    Mazon, Jose-Norberto
    Pardillo, Jesus
    Trujillo, Juan
    [J]. ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS: OTM 2008 WORKSHOPS, 2008, 5333 : 44 - +
  • [2] Systematic review and comparison modeling ETL processes in data warehouses
    Munoz, Lilia
    Mazon, Jose-Norberto
    Trujillo, Juan
    [J]. SISTEMAS Y TECNOLOGIAS DE INFORMACION, 2010, : 210 - 215
  • [3] Optimizing ETL processes in data warehouses
    Simitsis, A
    Vassiliadis, P
    Sellis, T
    [J]. ICDE 2005: 21ST INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2005, : 564 - 575
  • [4] A family of experiments to validate measures for UML activity diagrams of ETL processes in data warehouses
    Munoz, Lilia
    Mazon, Jose-Norberto
    Trujillo, Juan
    [J]. INFORMATION AND SOFTWARE TECHNOLOGY, 2010, 52 (11) : 1188 - 1203
  • [5] A UML profile for multidimensional modeling in data warehouses
    Lujan-Mora, Sergio
    Trujillo, Juan
    Song, Il-Yeol
    [J]. DATA & KNOWLEDGE ENGINEERING, 2006, 59 (03) : 725 - 769
  • [6] Modelling of Data Extraction in ETL Processes Using UML 2.0
    Mrunalini, M.
    Kumar, T. V. Suresh
    Geetha, D. Evangelin
    Rajanikanth, K.
    [J]. DESIDOC JOURNAL OF LIBRARY & INFORMATION TECHNOLOGY, 2006, 26 (05) : 3 - 9
  • [7] ETL Process Modeling Conceptual for Data Warehouses: A Systematic Mapping Study
    Munoz, L.
    Mazon, J. N.
    Trujillo, J.
    [J]. IEEE LATIN AMERICA TRANSACTIONS, 2011, 9 (03) : 360 - 365
  • [8] Integrating clustering data mining into the multidimensional modeling of Data Warehouses with UML profiles
    Zubcoff, Jose
    Pardillo, Jesus
    Trujillo, Juan
    [J]. DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, 2007, 4654 : 199 - +
  • [9] AN APPROACH FOR DESIGNING, MODELING AND REALIZING ETL PROCESSES BASED ON UNIFIED VIEWS MODEL
    Song, Xudong
    Liu, Xiaobing
    [J]. INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2011, 21 (04) : 543 - 570
  • [10] Optimized Incremental ETL Jobs for Maintaining Data Warehouses
    Behrend, Andreas
    Joerg, Thomas
    [J]. PROCEEDINGS OF THE FOURTEENTH INTERNATIONAL DATABASE ENGINEERING & APPLICATIONS SYMPOSIUM (IDEAS '10), 2010, : 216 - 224