In-Cache Query Co-Processing on Coupled CPU-GPU Architectures

Cited by: 50
Authors
He, Jiong [1]
Zhang, Shuhao [1]
He, Bingsheng [1]
Affiliations
[1] Nanyang Technol Univ, Singapore, Singapore
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2014 / Vol. 8 / No. 4
DOI
10.14778/2735496.2735497
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Recently, processor designs have emerged in which the CPU and the GPU (Graphics Processing Unit) are integrated on a single chip and share the Last Level Cache (LLC). However, the main memory bandwidth of such coupled CPU-GPU architectures can be much lower than that of a discrete GPU. As a result, current GPU query co-processing paradigms can suffer severely from memory stalls. In this paper, we propose a novel in-cache query co-processing paradigm for main memory On-Line Analytical Processing (OLAP) databases on coupled CPU-GPU architectures. Specifically, we adapt CPU-assisted prefetching to minimize cache misses in GPU query co-processing, and CPU-assisted decompression to improve query execution performance. Furthermore, we develop a cost-model-guided adaptation mechanism for distributing the workload of prefetching, decompression, and query execution between the CPU and the GPU. We implement a system prototype and evaluate it on two recent AMD APUs, the A8 and the A10. The experimental results show that 1) in-cache query co-processing can improve the performance of the state-of-the-art GPU co-processing paradigm by up to 30% and 33% on the A8 and the A10, respectively, and 2) our workload distribution adaptation mechanism can improve query performance by up to 36% and 40% on the A8 and the A10, respectively.
Pages: 329 - 340
Page count: 12
Related papers
50 in total
  • [1] Revisiting Co-Processing for Hash Joins on the Coupled CPU-GPU Architecture
    He, Jiong
    Lu, Mian
    He, Bingsheng
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2013, 6 (10): : 889 - 900
  • [2] FineQuery: Fine-Grained Query Processing on CPU-GPU Integrated Architectures
    Wang, Dalin
    Zhang, Feng
    Wan, Weitao
    Li, Hourun
    Du, Xiaoyong
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER 2021), 2021, : 355 - 365
  • [3] WCET Analysis of the Shared Data Cache in Integrated CPU-GPU Architectures
    Huangfu, Yijie
    Zhang, Wei
    [J]. 2017 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2017,
  • [4] Exploring Query Processing on CPU-GPU Integrated Edge Device
    Liu, Jiesong
    Zhang, Feng
    Li, Hourun
    Wang, Dalin
    Wan, Weitao
    Fang, Xiaokun
    Zhai, Jidong
    Du, Xiaoyong
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 4057 - 4070
  • [5] Demo: Accelerating Depth-Map on Mobile Device Using CPU-GPU Co-processing
    Fasogbon, Peter
    Aksu, Emre
    Heikkila, Lasse
    [J]. COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2019, PT I, 2019, 11678 : 75 - 86
  • [6] Optimizing B+-Tree Searches on Coupled CPU-GPU Architectures
    Huang, Han
    Luan, Hua
    [J]. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2020, PT I, 2020, 12452 : 401 - 415
  • [7] Rethinking Insertions to B+-Trees on Coupled CPU-GPU Architectures
    Huang, Han
    Luan, Hua
    [J]. 19TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2021), 2021, : 993 - 1001
  • [8] OSCAR: Orchestrating STT-RAM Cache Traffic for Heterogeneous CPU-GPU Architectures
    Zhan, Jia
    Kayiran, Onur
    Loh, Gabriel H.
    Das, Chita R.
    Xie, Yuan
    [J]. 2016 49TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2016,
  • [9] CPU-Assisted GPGPU on Fused CPU-GPU Architectures
    Yang, Yi
    Xiang, Ping
    Mantor, Mike
    Zhou, Huiyang
    [J]. 2012 IEEE 18TH INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2012, : 103 - 114
  • [10] Denial of Service in CPU-GPU Heterogeneous Architectures
    Wen, Hao
    Zhang, Wei
    [J]. 2020 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2020,