Optimizing the LU Factorization for Energy Efficiency on a Many-Core Architecture

Cited by: 4
Authors
Garcia, Elkin [1]
Arteaga, Jaime [1]
Pavel, Robert [1]
Gao, Guang R. [1]
Affiliations
[1] Univ Delaware, Dept Elect & Comp Engn, CAPSL, Newark, DE 19716 USA
Keywords
OPTIMIZATION; MODEL;
DOI
10.1007/978-3-319-09967-5_14
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Power consumption and energy efficiency have become a major bottleneck in the design of new systems for high performance computing. The path to exa-scale computing requires new strategies that decrease the energy consumption of modern many-core architectures without sacrificing scalability or performance. The development of these strategies demands the use of scalable models for energy consumption and the reorientation of optimization techniques to focus on energy efficiency, evaluating their trade-offs with respect to performance. In this paper, we investigate several optimization techniques to reduce the energy consumption on many-core architectures with a software-managed memory hierarchy. We study the impact of these techniques on the Static Energy and the Dynamic Energy of the LU factorization benchmark using a scalable energy consumption model. The main contributions of this paper are: (1) The modeling and analysis of energy consumption and energy efficiency for LU factorization; (2) the study and design of instruction-level and task-level optimizations for the reduction of the Static and Dynamic Energy; (3) the design and implementation of an energy aware tiling that decreases the Dynamic Energy of power hungry instructions in the LU factorization benchmark; and (4) the experimental evaluation of the scalability and improvement in terms of energy consumption and power efficiency of the proposed optimizations using the IBM Cyclops-64 many-core architecture. We study the trade-offs between performance and power efficiency for the proposed optimizations. Our results for the LU factorization benchmark, using 156 hardware thread units, show an improvement in power efficiency between 1.68X and 4.87X for different matrix sizes. In addition, we point out examples of optimizations that scale in performance but not necessarily in power efficiency.
Pages: 237-251
Number of pages: 15