HaoLap: A Hadoop based OLAP system for big data

Cited by: 29
Authors
Song, Jie [1]
Guo, Chaopeng [1]
Wang, Zhi [1]
Zhang, Yichan [1]
Yu, Ge [2]
Pierson, Jean-Marc [3]
Affiliations
[1] Northeastern Univ, Software Coll, Shenyang 110819, Peoples R China
[2] Northeastern Univ, Sch Informat & Engn, Shenyang 110819, Peoples R China
[3] Univ Toulouse 3, Lab IRIT, F-31062 Toulouse, France
Funding
National Natural Science Foundation of China; National Research Foundation, Singapore;
Keywords
Cloud data warehouse; Multidimensional data model; MapReduce;
DOI
10.1016/j.jss.2014.09.024
Chinese Library Classification number
TP31 [Computer software];
Subject classification codes
081202 ; 0835 ;
Abstract
In recent years, facing an information explosion, industry and academia have adopted distributed file systems and the MapReduce programming model to address the new challenges that big data has brought. Based on these technologies, this paper presents HaoLap (Hadoop based oLap), an OLAP (OnLine Analytical Processing) system for big data. Drawing on the experience of Multidimensional OLAP (MOLAP), HaoLap adopts a specified multidimensional model to map the dimensions and the measures; a dimension coding and traversal algorithm to achieve the roll-up operation on dimension hierarchies; a partition and linearization algorithm to store dimensions and measures; a chunk selection algorithm to optimize OLAP performance; and MapReduce to execute OLAP. The paper illustrates the key techniques of HaoLap, including system architecture, dimension definition, dimension coding and traversal, partitioning, data storage, OLAP, and the data loading algorithm. We evaluated HaoLap on a real application and compared it with Hive, HadoopDB, HBaseLattice, and Olap4Cloud. The experimental results show that HaoLap boosts the efficiency of data loading and has a great advantage in OLAP performance with respect to data set size and query complexity, while also fully supporting dimension operations. (C) 2014 Elsevier Inc. All rights reserved.
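The abstract names dimension coding, linearization, and roll-up without giving their details. As a rough illustration only, the following sketch shows generic row-major linearization of integer dimension codes and an aggregation that drops dimensions. It is not HaoLap's actual algorithm; the function names `linearize`, `delinearize`, and `roll_up`, and the example dimension sizes, are assumptions made for this sketch.

```python
# Illustrative sketch only: generic row-major linearization, NOT the
# algorithm from the HaoLap paper. All names here are hypothetical.

def linearize(coords, sizes):
    """Map integer dimension codes to a single row-major offset."""
    if len(coords) != len(sizes):
        raise ValueError("coords and sizes must have the same length")
    offset = 0
    for code, size in zip(coords, sizes):
        if not 0 <= code < size:
            raise IndexError("dimension code out of range")
        offset = offset * size + code
    return offset


def delinearize(offset, sizes):
    """Invert linearize: recover the dimension codes from an offset."""
    coords = []
    for size in reversed(sizes):
        offset, code = divmod(offset, size)
        coords.append(code)
    return tuple(reversed(coords))


def roll_up(cells, sizes, keep):
    """Sum measures over every dimension whose index is not in `keep`.

    `cells` maps linearized offsets to numeric measures. The result is
    keyed by offsets in the smaller cube formed by the kept dimensions.
    """
    kept_sizes = [sizes[i] for i in keep]
    result = {}
    for offset, measure in cells.items():
        coords = delinearize(offset, sizes)
        key = linearize([coords[i] for i in keep], kept_sizes)
        result[key] = result.get(key, 0) + measure
    return result


# Assumed example cube: 3 x 4 x 2 cells along three dimensions.
SIZES = (3, 4, 2)
CELLS = {
    linearize((0, 0, 0), SIZES): 5,
    linearize((0, 0, 1), SIZES): 7,
    linearize((1, 2, 0), SIZES): 1,
}
# Aggregate away the second and third dimensions, keeping only the first.
TOTALS = roll_up(CELLS, SIZES, keep=[0])
```

Storing measures under one integer offset instead of a tuple of dimension values is the usual motivation for linearization in MOLAP-style storage; how HaoLap partitions and encodes its chunks is described in the paper itself.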
Pages: 167-181
Number of pages: 15