Benchmarking SQL on MapReduce systems using large astronomy databases

Cited by: 0
Authors
Amin Mesmoudi
Mohand-Saïd Hacid
Farouk Toumani
Affiliations
[1] Université de Lyon
[2] CNRS
[3] Université Lyon 1
[4] LIRIS
[5] UMR5205
[6] Université Blaise Pascal
[7] CNRS
[8] LIMOS - UMR CNRS 6158
Keywords
LSST; DBMS; Benchmark; Distributed systems; MapReduce; SQL
DOI
Not available
Abstract
In the era of big data, with digital information of unprecedented volume being collected or produced in many application domains, managing and querying large data repositories is becoming increasingly difficult. Within the PetaSky project (http://com.isima.fr/Petasky), we focus on the problem of managing scientific data in the field of cosmology. The data we consider are those of the LSST project (http://www.lsst.org/). The overall size of the database that will be produced is expected to exceed 60 PB (LSST Data Challenge Handbook, 2012). To evaluate the performance of existing SQL-on-MapReduce data management systems, we conducted extensive experiments using data and queries from cosmology. The goal of this work is to report on the ability of such systems to support large-scale declarative queries. We mainly investigated the impact of data partitioning, indexing, and compression on query execution performance.
Pages: 347-378
Page count: 31