Specification-based data reduction in dimensional data warehouses

Cited by: 10
Authors
Skyt, Janne [1]
Jensen, Christian S. [1]
Pedersen, Torben Bach [1]
Affiliations
[1] Univ Aalborg, Dept Comp Sci, DK-9200 Aalborg, Denmark
Keywords
data reduction; data warehousing; multidimensional data; data models; physical deletion
DOI
10.1016/j.is.2007.06.001
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Many data warehouses contain massive amounts of data, accumulated over long periods of time. In some cases, it is necessary or desirable to either delete "old" data or to maintain the data at an aggregate level. This may be due to privacy concerns, in which case the data are aggregated to levels that ensure anonymity. Another reason is the desire to maintain a balance between the uses of data that change as the data age and the size of the data, thus avoiding overly large data warehouses. This paper presents effective techniques for data reduction that enable the gradual aggregation of detailed data as the data age. With these techniques, data may be aggregated to higher levels as they age, enabling the maintenance of more compact, consolidated data and compliance with privacy requirements. Special care is taken to avoid semantic problems in the aggregation process. The paper also describes the querying of the resulting data warehouses and an implementation strategy based on current database technology. (C) 2007 Elsevier B.V. All rights reserved.
Pages: 36-63
Number of pages: 28
Related papers
50 in total
  • [1] Specification-based data reduction in dimensional data warehouses
    Skyt, J
    Jensen, CS
    Pedersen, TB
    [J]. 18TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2002, : 278 - 278
  • [2] A SPECIFICATION-BASED DATA MODEL
    GANDHI, M
    ROBERTSON, EL
    [J]. LECTURE NOTES IN COMPUTER SCIENCE, 1992, 645 : 194 - 209
  • [3] Towards Agile Integration: Specification-based Data Alignment
    Giossi, Chris
    Maier, David
    Tufte, Kristin
    Gall, Elliot
    Barnes, Melissa
    [J]. 2020 IEEE 21ST INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE (IRI 2020), 2020, : 333 - 340
  • [4] Static specification analysis for termination of specification-based data structure repair
    Demsky, B
    Rinard, M
    [J]. ISSRE 2003: 14TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING, PROCEEDINGS, 2003, : 71 - 84
  • [5] Mutated Specification-Based Test Data Generation with a Genetic Algorithm
    Wang, Rong
    Sato, Yuji
    Liu, Shaoying
    [J]. MATHEMATICS, 2021, 9 (04) : 1 - 19
  • [6] Specification and management of interdependent data in operational systems and data warehouses
    GTE Lab Inc, Waltham, United States
    [J]. Distrib Parallel Databases, 2 (121-166):
  • [7] Specification and management of interdependent data in operational systems and data warehouses
    Georgakopoulos, D
    Karabatis, G
    Gantimahapatruni, S
    [J]. DISTRIBUTED AND PARALLEL DATABASES, 1997, 5 (02) : 121 - 166
  • [8] Specification and Management of Interdependent Data in Operational Systems and Data Warehouses
    Dimitrios Georgakopoulos
    George Karabatis
    Sridhar Gantimahapatruni
    [J]. Distributed and Parallel Databases, 1997, 5 : 121 - 166
  • [9] Goal-directed reasoning for specification-based data structure repair
    Demsky, Brian
    Rinard, Martin C.
    [J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2006, 32 (12) : 931 - 951
  • [10] Specification-Based Intrusion Detection Using Sequence Alignment and Data Clustering
    Kountche, Djibrilla Amadou
    Gombault, Sylvain
    [J]. FUTURE NETWORK SYSTEMS AND SECURITY, FNSS 2015, 2015, 523 : 31 - 46