In-memory Query System for Scientific Datasets

Cited by: 4
Authors
Hsuan-Te, Chiu [1]
Chou, Jerry [1]
Vishwanath, Venkat [2]
Wu, Kesheng [3]
Affiliations
[1] Natl Tsing Hua Univ, Hsinchu 30013, Taiwan
[2] Argonne Natl Lab, Argonne, IL 60439 USA
[3] Lawrence Berkeley Natl Lab, Berkeley, CA USA
Keywords
In-situ computing; query-driven analysis; indexing; scientific data; distributed shared memory
DOI
10.1109/ICPADS.2015.53
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The growing gap between compute performance and I/O bandwidth, coupled with increasing data volumes, has made traditional post-simulation data processing a bottleneck. In-situ computing and query-driven data analysis are therefore important techniques for minimizing data movement. Taking advantage of the growing memory capacity of supercomputers, we developed an in-memory query system for scientific data analysis. Our approach combines bitmap indexing, spatial data layout reorganization, distributed shared memory, and location-aware parallel execution. Our evaluations using real scientific datasets showed that we can aggregate the memory capacity of thousands of compute nodes to analyze a 750 GB simulation dataset without transferring data to remote nodes or storage systems. Compared with traditional out-of-core solutions based on parallel file systems, we achieve significantly higher query performance.
Pages: 362-371
Number of pages: 10