In-memory Query System for Scientific Datasets

Cited by: 4
Authors
Chiu, Hsuan-Te [1]
Chou, Jerry [1]
Vishwanath, Venkat [2]
Wu, Kesheng [3]
Affiliations
[1] Natl Tsing Hua Univ, Hsinchu 30013, Taiwan
[2] Argonne Natl Lab, Argonne, IL 60439 USA
[3] Lawrence Berkeley Natl Lab, Berkeley, CA USA
Keywords
In-situ computing; query-driven analysis; indexing; scientific data; distributed shared memory
DOI
10.1109/ICPADS.2015.53
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The growing gap between compute performance and I/O bandwidth, coupled with increasing data volumes, has made traditional post-simulation data processing a bottleneck. Hence, in-situ computing and query-driven data analysis are important techniques for minimizing data movement. By taking advantage of the growing memory capacity on supercomputers, we developed an in-memory query system for scientific data analysis. Our approach combines bitmap indexing, spatial data layout reorganization, distributed shared memory, and location-aware parallel execution. Our evaluations using real scientific datasets showed that we can aggregate the memory capacity of thousands of compute nodes to analyze a 750 GB simulation dataset without transferring data to remote nodes or storage systems. Compared to traditional solutions based on out-of-core parallel file systems, we achieve significantly higher query performance.
Pages: 362-371
Number of pages: 10