In-Cache Query Co-Processing on Coupled CPU-GPU Architectures

Cited by: 50

Authors:

He, Jiong [1]

Zhang, Shuhao [1]

He, Bingsheng [1]

Affiliations:

[1] Nanyang Technol Univ, Singapore, Singapore

Source:

PROCEEDINGS OF THE VLDB ENDOWMENT | 2014 / Vol. 8 / No. 04

Keywords:

DOI:

10.14778/2735496.2735497

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Recently, processor designs have emerged in which the CPU and the GPU (Graphics Processing Unit) are integrated on a single chip and share the Last Level Cache (LLC). However, the main memory bandwidth of such coupled CPU-GPU architectures can be much lower than that of a discrete GPU. As a result, current GPU query co-processing paradigms can suffer severely from memory stalls. In this paper, we propose a novel in-cache query co-processing paradigm for main memory On-Line Analytical Processing (OLAP) databases on coupled CPU-GPU architectures. Specifically, we adapt CPU-assisted prefetching to minimize cache misses in GPU query co-processing and CPU-assisted decompression to improve query execution performance. Furthermore, we develop a cost-model-guided adaptation mechanism for distributing the workload of prefetching, decompression, and query execution between the CPU and the GPU. We implement a system prototype and evaluate it on two recent AMD APUs, the A8 and the A10. The experimental results show that 1) in-cache query co-processing can effectively improve the performance of the state-of-the-art GPU co-processing paradigm by up to 30% and 33% on A8 and A10, respectively, and 2) our workload distribution adaptation mechanism can significantly improve the query performance by up to 36% and 40% on A8 and A10, respectively.

Citation:

Pages: 329-340

Page count: 12

50 items in total

[1] Revisiting Co-Processing for Hash Joins on the Coupled CPU-GPU Architecture
He, Jiong
Lu, Mian
He, Bingsheng
[J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2013, 6 (10): 889 - 900
[2] FineQuery: Fine-Grained Query Processing on CPU-GPU Integrated Architectures
Wang, Dalin
Zhang, Feng
Wan, Weitao
Li, Hourun
Du, Xiaoyong
[J]. 2021 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER 2021), 2021: 355 - 365
[3] WCET Analysis of the Shared Data Cache in Integrated CPU-GPU Architectures
Huangfu, Yijie
Zhang, Wei
[J]. 2017 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2017
[4] Exploring Query Processing on CPU-GPU Integrated Edge Device
Liu, Jiesong
Zhang, Feng
Li, Hourun
Wang, Dalin
Wan, Weitao
Fang, Xiaokun
Zhai, Jidong
Du, Xiaoyong
[J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12): 4057 - 4070
[5] Demo: Accelerating Depth-Map on Mobile Device Using CPU-GPU Co-processing
Fasogbon, Peter
Aksu, Emre
Heikkila, Lasse
[J]. COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2019, PT I, 2019, 11678: 75 - 86
[6] Optimizing B+-Tree Searches on Coupled CPU-GPU Architectures
Huang, Han
Luan, Hua
[J]. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2020, PT I, 2020, 12452: 401 - 415
[7] Rethinking Insertions to B+-Trees on Coupled CPU-GPU Architectures
Huang, Han
Luan, Hua
[J]. 19TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2021), 2021: 993 - 1001
[8] OSCAR: Orchestrating STT-RAM Cache Traffic for Heterogeneous CPU-GPU Architectures
Zhan, Jia
Kayiran, Onur
Loh, Gabriel H.
Das, Chita R.
Xie, Yuan
[J]. 2016 49TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2016
[9] CPU-Assisted GPGPU on Fused CPU-GPU Architectures
Yang, Yi
Xiang, Ping
Mantor, Mike
Zhou, Huiyang
[J]. 2012 IEEE 18TH INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2012: 103 - 114
[10] Denial of Service in CPU-GPU Heterogeneous Architectures
Wen, Hao
Zhang, Wei
[J]. 2020 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2020
