Selection of views to materialize in a data warehouse

Cited by: 119
Authors
Gupta, H [1]
Mumick, IS
Affiliations
[1] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA
[2] Kirusa Inc, Edison, NJ 08817 USA
Keywords
views; view selection; data warehouse; materialization
DOI
10.1109/TKDE.2005.16
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A data warehouse stores materialized views of data from one or more sources in order to answer decision-support or OLAP queries efficiently. One of the most important decisions in designing a data warehouse is the selection of materialized views to be maintained at the warehouse. The goal is to select a set of views that minimizes the total query response time and the cost of maintaining the selected views, given a limited amount of resources, e.g., materialization time or storage space. In this article, we develop a theoretical framework for the general problem of selecting views in a data warehouse. We present polynomial-time heuristics for selecting views to optimize total query response time under a disk-space constraint, for some important special cases of the general data warehouse scenario, namely: 1) an AND view graph, where each query/view has a unique evaluation, e.g., when a multiple-query optimizer can be used to generate a global evaluation plan for the queries, and 2) an OR view graph, in which any view can be computed from any one of its related views, e.g., data cubes. We present proofs showing that the algorithms are guaranteed to provide a solution that is fairly close to (within a constant factor of) the optimal solution. We extend our heuristic to general AND-OR view graphs. Finally, we address in detail the view-selection problem under a maintenance-cost constraint and present provably competitive heuristics.
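As a rough illustration of the kind of heuristic the abstract describes for OR view graphs, the sketch below greedily picks the view with the largest benefit per unit of space until the disk-space budget is exhausted. It is not the algorithm from the paper: the linear cost model (answering a query from a view costs the size of that view), the function names, and the tiny two-dimensional cube are all assumptions made for illustration, and no approximation guarantee is claimed for this code.

```python
from typing import Dict, List, Set


def total_cost(queries: Dict[str, float],
               answerable_from: Dict[str, Set[str]],
               size: Dict[str, int],
               materialized: Set[str]) -> float:
    """Frequency-weighted cost of all queries, each answered from the
    smallest materialized view that can answer it (assumed cost model)."""
    return sum(
        freq * min(size[v] for v in answerable_from[q] if v in materialized)
        for q, freq in queries.items()
    )


def greedy_select(size: Dict[str, int],
                  answerable_from: Dict[str, Set[str]],
                  queries: Dict[str, float],
                  base: str,
                  budget: int) -> List[str]:
    """Pick views greedily by benefit per unit space within the budget.

    The base view is assumed to be always available and is not charged
    against the budget.
    """
    selected = {base}
    chosen: List[str] = []
    remaining = budget
    while True:
        current = total_cost(queries, answerable_from, size, selected)
        best, best_ratio = None, 0.0
        for v in sorted(size):  # sorted for deterministic tie-breaking
            if v in selected or size[v] > remaining:
                continue
            benefit = current - total_cost(
                queries, answerable_from, size, selected | {v})
            ratio = benefit / size[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:  # nothing fits or nothing helps any more
            return chosen
        selected.add(best)
        chosen.append(best)
        remaining -= size[best]


# Hypothetical cube over dimensions a and b; "ab" is the base view.
SIZE = {"ab": 100, "a": 20, "b": 50, "none": 1}
ANSWERABLE_FROM = {
    "ab": {"ab"},
    "a": {"a", "ab"},
    "b": {"b", "ab"},
    "none": {"none", "a", "b", "ab"},
}
QUERIES = {"ab": 1.0, "a": 1.0, "b": 1.0, "none": 1.0}

picked = greedy_select(SIZE, ANSWERABLE_FROM, QUERIES, base="ab", budget=60)
```

With a budget of 60 units, this sketch selects the views "none" and "a" and skips "b", which no longer fits in the remaining space.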
Pages: 24-43
Page count: 20
Related papers
50 in total
  • [31] Processing aggregate queries with materialized views in data warehouse environment
    Chang, JY
    Kim, HJ
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2005, E88D (04): : 726 - 738
  • [32] Selection and classification of external information for the integration in a data warehouse
    Behme, W
    Mucksch, H
    [J]. WIRTSCHAFTSINFORMATIK, 1999, 41 (05): : 443 - +
  • [33] Building the data warehouse for materials selection in mechanical design
    Li, Y
    [J]. ADVANCED ENGINEERING MATERIALS, 2004, 6 (1-2) : 92 - 95
  • [34] ASVMRT: Materialized View Selection Algorithm in Data Warehouse
    Yang, Jin-Hyuk
    Chung, In-Jeong
    [J]. JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2006, 2 (02): : 67 - 75
  • [35] Key organizational factors in data warehouse architecture selection
    Ariyachandra, Thilini
    Watson, Hugh
    [J]. DECISION SUPPORT SYSTEMS, 2010, 49 (02) : 200 - 212
  • [36] Selection of Structures with Grid Optimization, in Multiagent Data Warehouse
    Gorawski, Marcin
    Bankowski, Slawomir
    Gorawski, Michal
    [J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2010, 2010, 6283 : 292 - 299
  • [37] An evolutionary approach to schema partitioning selection in a data warehouse
    Bellatreche, L
    Boukhalfa, K
    [J]. DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, 2005, 3589 : 115 - 125
  • [38] Optimal Genetic View Selection Algorithm for Data Warehouse
    王自强
    冯博琴
    [J]. Railway Engineering Science, 2005, (01) : 5 - 10
  • [39] Research on Materialized View Selection Algorithm in Data Warehouse
    Zhou Lijuan
    Ge Xuebin
    Wang Linshuang
    Shi Qian
    [J]. 2009 INTERNATIONAL FORUM ON COMPUTER SCIENCE-TECHNOLOGY AND APPLICATIONS, VOL 2, PROCEEDINGS, 2009, : 326 - 329
  • [40] Exploitation of referential integrity constraints for efficient update of data warehouse views
    Leung, CKS
    Lee, W
    [J]. DATABASE: ENTERPRISE, SKILLS AND INNOVATION, PROCEEDINGS, 2005, 3567 : 98 - 110