Logical Schema for Data Warehouse on Column-Oriented NoSQL Databases

Cited by: 16
Authors
Boussahoua, Mohamed [1]
Boussaid, Omar [1]
Bentayeb, Fadila [1]
Affiliations
[1] Univ Lumiere Lyon 2, ERIC, EA 3083, 5 Ave Pierre Mendes France, F-69676 Bron, France
Keywords
Data warehouses; NoSQL databases; Columns family;
DOI
10.1007/978-3-319-64471-4_20
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Column-oriented NoSQL systems offer a flexible, highly denormalized data schema that facilitates data warehouse scalability. However, implementing a data warehouse on a NoSQL database is challenging because it involves a distributed data management policy on multi-node clusters. In column-oriented NoSQL systems in particular, query performance can be improved by careful data grouping. In this paper, we present a method that uses clustering techniques, in particular k-means, to derive better column families from existing fact and dimension tables. To validate our method, we adopt the TPC-DS benchmark. We conducted several experiments to examine the benefits of clustering techniques for creating column families in the column-oriented NoSQL database HBase on the Hadoop platform. Our experiments suggest that defining a good data grouping in HBase when implementing a data warehouse significantly improves the performance of decision-support queries.
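The grouping idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which features are clustered, so this example assumes each attribute is described by a binary vector marking which workload queries reference it. The query-usage values, the helper names `kmeans` and `group_into_families`, and the `cf0`/`cf1` family labels are all hypothetical; only the attribute names are borrowed from TPC-DS tables.

```python
from typing import Dict, List


def kmeans(points: List[List[float]], k: int, iters: int = 20) -> List[int]:
    """Plain Lloyd's k-means; returns a cluster index for each point."""
    # Deterministic initialisation (an assumption made for reproducibility):
    # use the first k distinct points as starting centroids.
    centroids: List[List[float]] = []
    for p in points:
        if p not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    if len(centroids) < k:
        raise ValueError("need at least k distinct points")

    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def group_into_families(usage: Dict[str, List[int]], k: int) -> Dict[str, List[str]]:
    """Cluster attributes by query usage and name each cluster as a column family."""
    names = list(usage)
    labels = kmeans([[float(v) for v in usage[n]] for n in names], k)
    families: Dict[str, List[str]] = {}
    for name, label in zip(names, labels):
        families.setdefault(f"cf{label}", []).append(name)
    return families


# Hypothetical workload: one column per query, 1 = the query reads the attribute.
usage = {
    "ss_quantity":     [1, 1, 0, 0],
    "ss_sales_price":  [1, 1, 0, 0],
    "ss_net_profit":   [1, 0, 0, 0],
    "c_birth_year":    [0, 0, 1, 1],
    "c_email_address": [0, 0, 1, 1],
    "c_last_name":     [0, 0, 0, 1],
}
families = group_into_families(usage, k=2)
print(families)
```

Under these assumptions, attributes that tend to be read by the same queries fall into the same family, so a query touches fewer families. Choosing k and the feature representation are design decisions that the abstract leaves open.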
Pages: 247-256
Number of pages: 10
Related papers
  • [41] Self-adapting data migration in the context of schema evolution in NoSQL databases
    Andrea Hillenbrand
    Uta Störl
    Shamil Nabiyev
    Meike Klettke
    [J]. Distributed and Parallel Databases, 2022, 40 : 5 - 25
  • [42] MDICA: Maintenance of data integrity in column-oriented database applications
    Suarez-Cabal, Maria Jose
    Suarez-Otero, Pablo
    de la Riva, Claudio
    Tuya, Javier
    [J]. COMPUTER STANDARDS & INTERFACES, 2023, 83
  • [43] Document-Oriented Data Schema for Relational Database Migration to NoSQL
    Hamouda, Shady
    Zainol, Zurinahni
    [J]. 2017 3RD INTERNATIONAL CONFERENCE ON BIG DATA INNOVATIONS AND APPLICATIONS (INNOVATE-DATA), 2017, : 43 - 50
  • [44] Materialization strategies in a column-oriented DBMS
    Abadi, Daniel J.
    Myers, Daniel S.
    DeWitt, David J.
    Madden, Samuel R.
    [J]. 2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2007, : 441 - +
  • [45] ECOS: Evolutionary Column-Oriented Storage
    Rahman, Syed Saif Ur
    Schallehn, Eike
    Saake, Gunter
    [J]. ADVANCES IN DATABASES, 2011, 7051 : 18 - 32
  • [46] Column-Oriented Storage Techniques for MapReduce
    Floratou, Avrilia
    Patel, Jignesh M.
    Shekita, Eugene J.
    Tata, Sandeep
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2011, 4 (07): : 419 - 429
  • [47] VLog: A Column-Oriented Datalog Reasoner
    Urbani, Jacopo
    Jacobs, Ceriel
    Kroetzsch, Markus
    [J]. KI 2016: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2016, 9904 : 230 - 236
  • [48] A Generic Schema Evolution Approach for NoSQL and Relational Databases
    Chillon, Alberto Hernandez
    Klettke, Meike
    Ruiz, Diego Sevilla
    Molina, Jesus Garcia
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (07) : 2774 - 2789
  • [49] Schema Extraction in NoSQL Databases: A Systematic Literature Review
    Belefqih, Saad
    Zellou, Ahmed
    Berquedich, Mouna
    [J]. Recent Advances in Computer Science and Communications, 2024, 17 (08) : 92 - 104
  • [50] UMLtoNoSQL: Automatic Transformation of Conceptual Schema to NoSQL Databases
    Abdelhedi, Fatma
    Ait Brahim, Amal
    Atigui, Faten
    Zurfluh, Gilles
    [J]. 2017 IEEE/ACS 14TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2017, : 272 - 279