In-memory Query System for Scientific Datasets

Cited by: 4
Authors
Hsuan-Te, Chiu [1]
Chou, Jerry [1]
Vishwanath, Venkat [2]
Wu, Kesheng [3]
Affiliations
[1] Natl Tsing Hua Univ, Hsinchu 30013, Taiwan
[2] Argonne Natl Lab, Argonne, IL 60439 USA
[3] Lawrence Berkeley Natl Lab, Berkeley, CA USA
Keywords
In-situ computing; query-driven analysis; indexing; scientific data; distributed shared memory
DOI
10.1109/ICPADS.2015.53
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The growing gap between compute performance and I/O bandwidth, coupled with increasing data volumes, has made traditional post-simulation data processing a bottleneck. Hence, in-situ computing and query-driven data analysis are important techniques for minimizing data movement. By taking advantage of the growing memory capacity on supercomputers, we developed an in-memory query system for scientific data analysis. Our approach combines bitmap indexing, spatial data layout reorganization, distributed shared memory, and location-aware parallel execution. Our evaluations using real scientific datasets showed that we can aggregate the memory capacity of thousands of compute nodes to analyze a 750 GB simulation dataset without transferring data to remote nodes or storage systems. Compared with traditional solutions based on out-of-core parallel file systems, we achieve significantly higher query performance.
Pages: 362-371
Number of pages: 10