Selection of views to materialize in a data warehouse

Cited by: 119
Authors
Gupta, H [1]
Mumick, IS
Affiliations
[1] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA
[2] Kirusa Inc, Edison, NJ 08817 USA
Keywords
views; view selection; data warehouse; materialization;
DOI
10.1109/TKDE.2005.16
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. One of the most important decisions in designing a data warehouse is the selection of materialized views to be maintained at the warehouse. The goal is to select an appropriate set of views that minimizes total query response time and the cost of maintaining the selected views, given a limited amount of a resource, e.g., materialization time, storage space, etc. In this article, we develop a theoretical framework for the general problem of selecting views in a data warehouse. We present polynomial-time heuristics for selecting views to optimize total query response time under a disk-space constraint, for some important special cases of the general data warehouse scenario, viz.: 1) an AND view graph, where each query/view has a unique evaluation, e.g., when a multiple-query optimizer can be used to generate a global evaluation plan for the queries, and 2) an OR view graph, in which any view can be computed from any one of its related views, e.g., data cubes. We present proofs showing that the algorithms are guaranteed to provide a solution that is fairly close to (within a constant factor of) the optimal solution. We extend our heuristic to general AND-OR view graphs. Finally, we address in detail the view-selection problem under the maintenance cost constraint and present provably competitive heuristics.
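To make the OR view graph case in the abstract concrete, the following is a minimal sketch of a greedy benefit-per-unit-space selection under a disk-space budget. It is not the paper's algorithm as published and carries none of its proven guarantees. The toy cube lattice, the view sizes, the uniform query frequencies, the linear cost model (answering a query costs the size of the smallest selected view it can be computed from), and all function names are assumptions made for illustration. The base view is assumed to be always available and is not charged against the budget.

```python
from typing import Dict, Set

# Hypothetical toy data cube over dimensions a, b, c (sizes are made up).
SIZE: Dict[str, int] = {"abc": 100, "ab": 50, "ac": 75, "bc": 20,
                        "a": 25, "b": 10, "c": 1}
# OR view graph: a view can be computed from any ONE of its ancestors.
ANCESTORS: Dict[str, Set[str]] = {
    "abc": set(),
    "ab": {"abc"}, "ac": {"abc"}, "bc": {"abc"},
    "a": {"ab", "ac", "abc"},
    "b": {"ab", "bc", "abc"},
    "c": {"ac", "bc", "abc"},
}
FREQ: Dict[str, int] = {v: 1 for v in SIZE}  # assumed uniform query workload
BASE = "abc"


def query_cost(view: str, selected: Set[str]) -> int:
    """Cost of answering `view`: size of the smallest selected source."""
    sources = [SIZE[v] for v in selected
               if v == view or v in ANCESTORS[view]]
    return min(sources)  # never empty because BASE is always selected


def total_cost(selected: Set[str]) -> int:
    """Frequency-weighted cost of answering every view as a query."""
    return sum(FREQ[q] * query_cost(q, selected) for q in SIZE)


def greedy_select(space_budget: int) -> Set[str]:
    """Repeatedly add the view with the best benefit per unit of space."""
    selected = {BASE}
    used = 0
    while True:
        current = total_cost(selected)
        best, best_ratio = None, 0.0
        for v in SIZE:
            if v in selected or used + SIZE[v] > space_budget:
                continue
            benefit = current - total_cost(selected | {v})
            ratio = benefit / SIZE[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:  # nothing fits or nothing helps
            return selected
        selected.add(best)
        used += SIZE[best]


chosen = greedy_select(space_budget=70)
print(sorted(chosen), total_cost(chosen))
```

With a budget of 70 units the sketch picks the small, widely reusable views first and skips `ab` and `ac`, which no longer fit. Ranking by benefit per unit of space rather than by raw benefit is the design choice that keeps one large view from exhausting the budget early.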
Pages: 24 - 43
Page count: 20
Related papers
50 in total
  • [41] Maintaining consistency in partially self-maintainable views at the data warehouse
    Samtani, S
    Kumar, V
    [J]. NINTH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 1998, : 206 - 211
  • [42] Two-Phase Optimization for Selecting Materialized Views in a Data Warehouse
    Phuboon-ob, Jiratta
    Auepanwiriyakul, Raweewan
    [J]. PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 19, 2007, 19 : 277 - 281
  • [43] Fulvis: New approach for selecting views to materialize in Hybrid Information Integration
    Hadi, Wadii
    Zellou, Ahmed
    Bounabat, Bouchaib
    [J]. 2013 5TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND INFORMATION TECHNOLOGY (CSIT), 2013, : 248 - 255
  • [44] Analysis and Comparison on Selection Algorithms of Materialized View in Data Warehouse
    Lin, Qiao
    [J]. 2011 INTERNATIONAL CONFERENCE ON FUTURE COMPUTERS IN EDUCATION (ICFCE 2011), VOL II, 2011, : 132 - 136
  • [45] On solving the view selection problem in distributed data warehouse architectures
    Bauer, A
    Lehner, W
    [J]. SSDBM 2002: 15TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, 2003, : 43 - 51
  • [46] A Comparative Analysis of Fragmentation Selection Algorithms for Data Warehouse Partitioning
    Thenmozhi, M.
    Vivekanandan, K.
    [J]. 2014 INTERNATIONAL CONFERENCE ON ADVANCES IN ENGINEERING AND TECHNOLOGY RESEARCH (ICAETR), 2014,
  • [47] Applying evolutionary algorithms to materialized view selection in a data warehouse
    Horng, JT
    Chang, YJ
    Liu, BJ
    [J]. SOFT COMPUTING, 2003, 7 (08) : 574 - 581
  • [48] Materialized view selection based on query cost in data warehouse
    Zhou, LJ
    Liu, C
    Liu, D
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY: THEORY, TOOLS, AND TECHNOLOGY VI, 2004, 5433 : 246 - 252
  • [49] Organization and materialization strategies of multidimensional views in railway freight transportation data warehouse
    Lin, Y.F.
    Huang, H.K.
    Tian, S.F.
    [J]. Tiedao Xuebao/Journal of the China Railway Society, 2001, 23 (02):
  • [50] Complex view selection for data warehouse self-maintainability
    Theodoratos, D
    [J]. COOPERATIVE INFORMATION SYSTEMS, PROCEEDINGS, 2000, 1901 : 78 - 89