Optimizing the LU Factorization for Energy Efficiency on a Many-Core Architecture

Cited by: 4
Authors
Garcia, Elkin [1 ]
Arteaga, Jaime [1 ]
Pavel, Robert [1 ]
Gao, Guang R. [1 ]
Affiliation
[1] Univ Delaware, Dept Elect & Comp Engn, CAPSL, Newark, DE 19716 USA
Keywords
OPTIMIZATION; MODEL;
DOI
10.1007/978-3-319-09967-5_14
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Power consumption and energy efficiency have become a major bottleneck in the design of new systems for high-performance computing. The path to exascale computing requires new strategies that decrease the energy consumption of modern many-core architectures without sacrificing scalability or performance. Developing these strategies demands scalable models of energy consumption and a reorientation of optimization techniques toward energy efficiency, with their trade-offs against performance evaluated explicitly. In this paper, we investigate several optimization techniques for reducing energy consumption on many-core architectures with a software-managed memory hierarchy. We study the impact of these techniques on the Static Energy and the Dynamic Energy of the LU factorization benchmark using a scalable energy consumption model. The main contributions of this paper are: (1) the modeling and analysis of energy consumption and energy efficiency for LU factorization; (2) the study and design of instruction-level and task-level optimizations for reducing the Static and Dynamic Energy; (3) the design and implementation of an energy-aware tiling that decreases the Dynamic Energy of power-hungry instructions in the LU factorization benchmark; and (4) the experimental evaluation of the scalability and improvement, in terms of energy consumption and power efficiency, of the proposed optimizations on the IBM Cyclops-64 many-core architecture. We study the trade-offs between performance and power efficiency for the proposed optimizations. Our results for the LU factorization benchmark, using 156 hardware thread units, show an improvement in power efficiency of between 1.68X and 4.87X across different matrix sizes. In addition, we point out examples of optimizations that scale in performance but not necessarily in power efficiency.
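The abstract does not give the energy model or the tiling scheme, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes Static Energy can be approximated as static power times execution time, and Dynamic Energy as a sum of per-instruction energies weighted by instruction counts. It also shows a generic right-looking tiled LU factorization without pivoting, which is the kind of loop structure a tiling optimization would reshape. All function names, parameters, and instruction labels are hypothetical.

```python
def total_energy(static_power_w, exec_time_s, instr_counts, energy_per_instr_j):
    """Toy energy model (an assumption, not the paper's model).

    Static energy  = static power * execution time.
    Dynamic energy = sum over instruction types of count * energy per instruction.
    Returns the pair (static_energy_j, dynamic_energy_j).
    """
    static = static_power_w * exec_time_s
    dynamic = sum(instr_counts[op] * energy_per_instr_j[op] for op in instr_counts)
    return static, dynamic


def lu_tiled(a, tile):
    """In-place right-looking blocked LU without pivoting on a list of lists.

    On return, the strict lower triangle of `a` holds L (unit diagonal implied)
    and the upper triangle holds U. Assumes all pivots are nonzero.
    """
    n = len(a)
    for k0 in range(0, n, tile):
        k1 = min(k0 + tile, n)
        # 1) Factor the panel: columns k0..k1-1, all rows from the diagonal down.
        for k in range(k0, k1):
            pivot = a[k][k]
            for i in range(k + 1, n):
                a[i][k] /= pivot
                for j in range(k + 1, k1):
                    a[i][j] -= a[i][k] * a[k][j]
        # 2) Update the tiles right of the diagonal tile: U12 = L11^{-1} * A12.
        for k in range(k0, k1):
            for i in range(k + 1, k1):
                for j in range(k1, n):
                    a[i][j] -= a[i][k] * a[k][j]
        # 3) Trailing update, one tile at a time: A22 -= L21 * U12.
        for i0 in range(k1, n, tile):
            for j0 in range(k1, n, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, n)):
                        s = 0.0
                        for k in range(k0, k1):
                            s += a[i][k] * a[k][j]
                        a[i][j] -= s
    return a


def reconstruct(lu):
    """Multiply the packed L and U factors back together (for checking)."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]
                s += l_ik * lu[k][j]
            out[i][j] = s
    return out


# Usage: factor a small diagonally dominant matrix, which needs no pivoting.
n = 5
original = [[30.0 if i == j else float((i + 2 * j) % 5 + 1) for j in range(n)]
            for i in range(n)]
packed = lu_tiled([row[:] for row in original], tile=2)
rebuilt = reconstruct(packed)
max_error = max(abs(rebuilt[i][j] - original[i][j])
                for i in range(n) for j in range(n))
```

In this sketch the tile size only changes the order of the arithmetic, not the result. On hardware with a software-managed memory hierarchy, the tile size would also determine how many loads and stores of each kind are issued, which is the sort of quantity the `instr_counts` argument of the toy model stands in for.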
Pages: 237-251
Page count: 15
Related papers
50 in total
  • [41] Automatically Optimizing Stencil Computations on Many-Core NUMA Architectures
    Lin, Pei-Hung
    Yi, Qing
    Quinlan, Daniel
    Liao, Chunhua
    Yan, Yongqing
    LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING, LCPC 2016, 2017, 10136 : 137 - 152
  • [42] Study on Fine-grained Synchronization in Many-Core Architecture
    Yu, Lei
    Liu, Zhiyong
    Fan, Dongrui
    Song, Fenglong
    Zhang, Junchao
    Yuan, Nan
    SNPD 2009: 10TH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCES, NETWORKING AND PARALLEL DISTRIBUTED COMPUTING, PROCEEDINGS, 2009, : 524 - 529
  • [43] Self-Healing Many-Core Architecture: Analysis and Evaluation
    Kamran, Arezoo
    Navabi, Zainalabedin
    VLSI DESIGN, 2016, 2016
  • [44] Parallel Code Generation of Synchronous Programs for a Many-core Architecture
    Graillat, Amaury
    Moy, Matthieu
    Raymond, Pascal
    de Dinechin, Benoit Dupont
    PROCEEDINGS OF THE 2018 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2018, : 1139 - 1142
  • [45] Scaling Graph Community Detection on the Tilera Many-core Architecture
    Chavarria-Miranda, Daniel
    Halappanavar, Mahantesh
    Kalyanaraman, Ananth
    2014 21ST INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC), 2014,
  • [46] Architecture Decomposition in System Synthesis of Heterogeneous Many-Core Systems
    Richthammer, Valentina
    Schwarzer, Tobias
    Wildermann, Stefan
    Teich, Juergen
    Glass, Michael
    2018 55TH ACM/ESDA/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2018,
  • [47] Design and Analysis of a Many-Core Processor Architecture for Multimedia Applications
    Lai, Jyu-Yuan
    Chen, Po-Yu
    Hsu, Ting-Shuo
    Huang, Chih-Tsun
    Liou, Jing-Jia
    2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012,
  • [48] Task Sampling: Computer Architecture Simulation in the Many-Core Era
    Grass, Thomas
    2013 22ND INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT), 2013, : 405 - 405
  • [49] Exploring performance and energy tradeoffs for irregular applications: A case study on the Tilera many-core architecture
    Panyala, Ajay
    Chavarria-Miranda, Daniel
    Manzano, Joseph B.
    Tumeo, Antonino
    Halappanavar, Mahantesh
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2017, 104 : 234 - 251
  • [50] Circuit Modeling for Practical Many-core Architecture Design Exploration
    Truong, Dean N.
    Baas, Bevan M.
    PROCEEDINGS OF THE 47TH DESIGN AUTOMATION CONFERENCE, 2010, : 627 - 628