Performance and energy impact of parallelization and vectorization techniques in modern microprocessors

Cited by: 0
Authors
Juan M. Cebrián
Lasse Natvig
Jan Christian Meyer
Affiliations
[1] NTNU, Department of Computer and Information Science (IDI)
[2] NTNU, High Performance Computing Section, IT Department
Source
Computing | 2014 / Vol. 96
Keywords
Performance evaluation; Energy efficiency; Vectorization; 68-04;
DOI
Not available
Abstract
While Moore’s law states that the number of transistors is approximately doubled every 2 years, powering these transistors simultaneously is only possible as long as Dennard scaling continues. Unfortunately, voltage scaling has slowed down in recent years, and microprocessor designers have hit what is known as the “utilization wall” or the “dark silicon” effect. Vectorization, parallelization, specialization and heterogeneity are the key approaches to deal with this utilization wall. However, how software developers can maximize energy efficiency of these architectures remains an open question. This paper presents an energy evaluation of parallelization using both physical and logical cores (i.e., SMT/Hyper-Threading), vectorization (SSE, Advanced Vector Extensions and NEON) and dynamic core reconfiguration [Intel®’s Turbo Boost Technology (TBT)]. The evaluation spans microprocessors for embedded, laptop, desktop and server markets, since there is a convergence among them towards energy efficiency.
The analyzed processors include Intel’s Core™ i5 and i7 family and ARM®’s Cortex™ A9 and A15. Results show that software developers should prioritize vectorization over thread parallelism when possible, as it yields better energy efficiency, especially on the Intel platforms. Application scalability can be reduced drastically when using vectorization and threading simultaneously since vectorization increases pressure on the memory subsystem. Intel’s TBT further improves energy efficiency by an additional 10–20 % depending on the number of active threads.
Pages: 1179 - 1193
Page count: 14
Related papers
50 in total
  • [1] Performance and energy impact of parallelization and vectorization techniques in modern microprocessors
    Cebrian, Juan M.
    Natvig, Lasse
    Meyer, Jan Christian
    [J]. COMPUTING, 2014, 96 (12) : 1179 - 1193
  • [2] The impact of vectorization and parallelization of the slope algorithm on performance and energy efficiency on multi-core architecture
    Bylina, Beata
    Potiopa, Joanna
    Klisowski, Michal
    Bylina, Jaroslaw
    [J]. PROCEEDINGS OF THE 2021 16TH CONFERENCE ON COMPUTER SCIENCE AND INTELLIGENCE SYSTEMS (FEDCSIS), 2021, : 283 - 290
  • [3] On the Impact of Performance Faults in Modern Microprocessors
    Naghmeh Karimi
    Michail Maniatakos
    Chandrasekharan (Chandra) Tirumurti
    Yiorgos Makris
    [J]. Journal of Electronic Testing, 2013, 29 : 351 - 366
  • [4] On the Impact of Performance Faults in Modern Microprocessors
    Karimi, Naghmeh
    Maniatakos, Michail
    Tirumurti, Chandrasekharan
    Makris, Yiorgos
    [J]. JOURNAL OF ELECTRONIC TESTING-THEORY AND APPLICATIONS, 2013, 29 (03): : 351 - 366
  • [5] Improving the performance of the needleman-wunsch algorithm using parallelization and vectorization techniques
    Jararweh, Yaser
    Al-Ayyoub, Mahmoud
    Fakirah, Maged
    Alawneh, Luay
    Gupta, Brij B.
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (04) : 3961 - 3977
  • [6] Improving the performance of the needleman-wunsch algorithm using parallelization and vectorization techniques
    Yaser Jararweh
    Mahmoud Al-Ayyoub
    Maged Fakirah
    Luay Alawneh
    Brij B. Gupta
    [J]. Multimedia Tools and Applications, 2019, 78 : 3961 - 3977
  • [7] Impact Analysis of Performance Faults in Modern Microprocessors
    Karimi, Naghmeh
    Maniatakos, Michail
    Tirumurti, Chandra
    Jas, Abhijit
    Makris, Yiorgos
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN, 2009, : 91 - +
  • [8] VECTORIZATION AND PARALLELIZATION ON HIGH-PERFORMANCE COMPUTERS
    SEKERA, Z
    [J]. COMPUTER PHYSICS COMMUNICATIONS, 1992, 73 (1-3) : 113 - 138
  • [9] Evaluation of vectorization/parallelization techniques: Application to nonparametric curve estimation
    DoalloBiempica, R
    FraguelaRodriguez, BB
    QuintelaDelRio, A
    [J]. STATISTICS AND COMPUTING, 1996, 6 (04) : 347 - 351
  • [10] Assessing the Impact of Hard Faults in Performance Components of Modern Microprocessors
    Foutris, Nikos
    Gizopoulos, Dimitris
    Kalamatianos, John
    Sridharan, Vilas
    [J]. 2013 IEEE 31ST INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD), 2013, : 207 - 214