Automating and Optimizing Data Transfers for Many-core Coprocessors

Citations: 0
Authors
Ren, Bin [1]
Ravi, Nishkam [2]
Yang, Yi [2]
Feng, Min [2]
Agrawal, Gagan [1]
Chakradhar, Srimat [2]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] NEC Labs Amer, Princeton, NJ USA
Keywords
Coprocessors; Static Analysis; Runtime Analysis; Offloading;
DOI
10.1145/2597652.2600114
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Orchestrating data transfers between CPUs and a coprocessor manually is cumbersome, particularly for multi-dimensional arrays and other data structures with multi-level pointers, which are common in scientific computations. This work describes a system that includes both compile-time and runtime solutions for this problem, with the overarching goal of improving programmer productivity while maintaining performance. We implemented our best compile-time solution, partial linearization with pointer reset, as a source-to-source transformation, and evaluated it on multiple C benchmarks. Our experimental results show that our best compile-time solution runs 2.5x-5x faster than the original runtime solution, and that the CPU-coprocessor code using it achieves a 1.5x-2.5x speedup over the 16-thread CPU version.
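The abstract names its technique but does not spell it out. The following is a minimal sketch of one plausible reading of "partial linearization with pointer reset" for a two-level array (`double **`), not the authors' implementation. It assumes that the rows are packed into one contiguous block so they can be moved in a single transfer, and that the row-pointer table is then rebuilt on the receiving side to point into that block, so code written as `a[i][j]` keeps working. The helper names `linearize_rows`, `reset_row_pointers`, and `sum_2d` are hypothetical, and the actual CPU-to-coprocessor copy is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: pack rows that were allocated separately into
 * one contiguous buffer, so the payload can move in a single transfer.
 * Returns NULL on allocation failure; the caller frees the result. */
double *linearize_rows(double **rows, size_t nrows, size_t ncols)
{
    double *flat = malloc(nrows * ncols * sizeof *flat);
    if (flat == NULL)
        return NULL;
    for (size_t i = 0; i < nrows; i++)
        memcpy(flat + i * ncols, rows[i], ncols * sizeof *flat);
    return flat;
}

/* Hypothetical helper: "pointer reset". Host addresses are meaningless
 * on the receiving side, so each row pointer is recomputed from the
 * base of the contiguous block instead of being copied. */
void reset_row_pointers(double **rows, double *flat,
                        size_t nrows, size_t ncols)
{
    assert(rows != NULL && flat != NULL);
    for (size_t i = 0; i < nrows; i++)
        rows[i] = flat + i * ncols;
}

/* Stand-in for unmodified kernel code that uses a[i][j] indexing. */
double sum_2d(double **a, size_t nrows, size_t ncols)
{
    double total = 0.0;
    for (size_t i = 0; i < nrows; i++)
        for (size_t j = 0; j < ncols; j++)
            total += a[i][j];
    return total;
}
```

In this sketch a caller would run `linearize_rows` on the host, copy the flat block to the coprocessor by whatever mechanism the platform provides, and call `reset_row_pointers` on a row table that lives on the coprocessor. Only the data payload is linearized; the two-level indexing in the kernel is left untouched, which is the assumed sense of "partial" here.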
Pages: 177 - 177
Page count: 1
Related papers
50 in total
  • [31] Optimizing memory bandwidth exploitation for OpenVX applications on embedded many-core accelerators
    Giuseppe Tagliavini
    Germain Haugou
    Andrea Marongiu
    Luca Benini
    Journal of Real-Time Image Processing, 2018, 15 : 73 - 92
  • [32] Parallelizing and optimizing a bioinformatics pairwise sequence alignment algorithm for many-core architecture
    Diaz, David
    Jose Esteban, Francisco
    Hernandez, Pilar
    Antonio Caballero, Juan
    Dorado, Gabriel
    Galvez, Sergio
    PARALLEL COMPUTING, 2011, 37 (4-5) : 244 - 259
  • [33] Many-Core Event Evaluation
    Marvie, Jean-Eudes
    Hirtzlin, Patrice
    Gautron, Pascal
    WEB3D 2013: 18TH INTERNATIONAL CONFERENCE ON 3D WEB TECHNOLOGY, 2013, : 181 - 189
  • [34] Teaching Many-Core Programming
    Tsiopoulos, Leonidas
    Johkio, Fareed Ahmed
    Georgakarakos, Georgios
    Dahlin, Andreas
    Lilius, Johan
    10TH EUROPEAN WORKSHOP ON MICROELECTRONICS EDUCATION (EWME), 2014, : 7 - 10
  • [35] Data Criticality in Multithreaded Applications: An Insight for Many-Core Systems
    Das, Abhijit
    Jose, John
    Mishra, Prabhat
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2021, 29 (09) : 1675 - 1679
  • [36] Many-Core Compiler Fuzzing
    Lidbury, Christopher
    Lascu, Andrei
    Chong, Nathan
    Donaldson, Alastair F.
    ACM SIGPLAN NOTICES, 2015, 50 (06) : 65 - 76
  • [37] A Many-core Parallelizing Processor
    Porada, Katarzyna
    2017 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 2017, : 875 - 877
  • [38] Efficient Distributed Data Structures for Future Many-core Architectures
    Fatourou, Panagiota
    Kallimanis, Nikolaos D.
    Kanellou, Eleni
    Makridakis, Odysseas
    Symeonidou, Christi
    arXiv,
  • [39] In-Place Data Sliding Algorithms for Many-Core Architectures
    Gomez-Luna, Juan
    Chang, Li-Wen
    Hwu, Wen-Mei W.
    Sung, I-Jui
    Guil, Nicolas
    2015 44TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP), 2015, : 210 - 219
  • [40] Optimizing the Matrix Multiplication Using Strassen and Winograd Algorithms with Limited Recursions on Many-Core
    Khan, Ayaz ul Hassan
    Al-Mouhamed, Mayez
    Fatayer, Allam
    Mohammad, Nazeeruddin
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2016, 44 (04) : 801 - 830