Approximate Computing for Multithreaded Programs in Shared Memory Architectures

Cited by: 0
Authors
Nongpoh, Bernard [1 ]
Ray, Rajarshi [2 ]
Banerjee, Ansuman [3 ]
Affiliations
[1] Natl Inst Technol Meghalaya, Dept Comp Sci & Engn, Shillong, Meghalaya, India
[2] Indian Assoc Cultivat Sci, Sch Math & Computat Sci, Kolkata, W Bengal, India
[3] Indian Stat Inst, Adv Comp & Microelect Unit, Kolkata, W Bengal, India
Keywords
Hypothesis testing; Cache coherence; Approximate computing; Sensitivity analysis; Escape analysis
DOI
10.1145/3359986.3361209
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In multicore, multi-cache architectures, cache coherence is ensured by a coherence protocol. However, the performance benefits of caching diminish due to the cost of implementing the protocol. In this paper, we propose a novel technique to improve the performance of multithreaded programs running on shared-memory multicore processors by embracing approximate computing. Our idea is to relax the coherence requirement selectively in order to reduce the cost associated with a cache-coherence protocol while ensuring a bounded QoS degradation with probabilistic reliability. In particular, we detect instructions in a multithreaded program that write to shared data, which we call Shared-Write-Access-Points (SWAPs), and propose an automated statistical analysis to identify those that can tolerate coherence faults. We call such SWAPs approximable. Our experiments on 9 applications from the SPLASH 3.0 benchmark suite reveal that, on average, 57% of the tested SWAPs are approximable. To leverage this observation, we propose an adapted cache-coherence protocol that relaxes the coherence requirement on stores from approximable SWAPs. Additionally, our protocol uses stale values for load misses due to coherence, the stale value being the version at the time of invalidation. In architectural simulation of the 9 applications under our approximate execution scheme, we observe an average 15% reduction in CPU cycles and an 11% reduction in energy footprint.
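The abstract does not say which statistical test is used to classify a SWAP as approximable. As a rough illustration only, the idea of "bounded QoS degradation with probabilistic reliability" could be sketched as a one-sided exact binomial test: run the program repeatedly with coherence faults injected at one SWAP, count the runs whose QoS loss stays within the bound, and accept the SWAP only if the evidence rejects the hypothesis that the success probability is at most p0. The function names, the threshold p0, and the significance level alpha below are all hypothetical and are not taken from the paper.

```python
from math import comb


def binomial_tail(n: int, k: int, p0: float) -> float:
    """Return P[X >= k] for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


def is_approximable(successes: int, trials: int,
                    p0: float = 0.9, alpha: float = 0.05) -> bool:
    """Hypothetical decision rule (an assumption, not the paper's method).

    successes: runs with injected coherence faults whose QoS loss stayed in bound
    trials:    total fault-injection runs for this SWAP
    Tests H0: p <= p0 against H1: p > p0 and accepts the SWAP when the
    exact one-sided p-value falls below alpha.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p_value = binomial_tail(trials, successes, p0)
    return p_value < alpha
```

Under these assumed settings, 100 in-bound runs out of 100 give a p-value of 0.9^100 (roughly 2.7e-5), so the SWAP would be accepted, whereas 90 out of 100 is consistent with p = 0.9 and would not be.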
Pages: 9
Related papers
50 in total
  • [1] Performance of shared caches on multithreaded architectures
    Chen, YY
    Peir, JK
    King, CT
    [J]. JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 1998, 14 (02) : 499 - 514
  • [2] Approximate weighted matching on emerging manycore and multithreaded architectures
    Halappanavar, Mahantesh
    Feo, John
    Villa, Oreste
    Tumeo, Antonino
    Pothen, Alex
    [J]. INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2012, 26 (04) : 413 - 430
  • [3] Context-Bounded Verification of Liveness Properties for Multithreaded Shared-Memory Programs
    Baumann, Pascal
    Majumdar, Rupak
    Thinniyam, Ramanathan S.
    Zetzsche, Georg
    [J]. PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL, 2021, 5 (POPL):
  • [4] Deep In-Memory Architectures in SRAM: An Analog Approach to Approximate Computing
    Kang, Mingu
    Gonugondla, Sujan K.
    Shanbhag, Naresh R.
    [J]. PROCEEDINGS OF THE IEEE, 2020, 108 (12) : 2251 - 2275
  • [5] Performance bounds for distributed memory multithreaded architectures
    Zuberek, WM
    Govindarajan, R
    [J]. 1998 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-5, 1998, : 232 - 237
  • [6] A Tool to Analyze the Performance of Multithreaded Programs on NUMA Architectures
    Liu, Xu
    Mellor-Crummey, John
    [J]. ACM SIGPLAN NOTICES, 2014, 49 (08) : 259 - 271
  • [7] Approximate simulation of distributed-memory multithreaded multiprocessors
    Zuberek, WM
    [J]. 35TH ANNUAL SIMULATION SYMPOSIUM, PROCEEDINGS, 2002, : 107 - 114
  • [8] A multithreaded processor designed for distributed shared memory systems
    Grunewald, W
    Ungerer, T
    [J]. ADVANCES IN PARALLEL AND DISTRIBUTED COMPUTING - PROCEEDINGS, 1997, : 206 - 213
  • [9] Nonvolatile Low-Cost Approximate Spintronic Full Adders for Computing in Memory Architectures
    Rajaei, Ramin
    Amirany, Abdolah
    [J]. IEEE TRANSACTIONS ON MAGNETICS, 2020, 56 (04)
  • [10] Automatic Detection of Shared Objects in Multithreaded Java Programs
    Tolubaeva, Munara
    Can, Aysu Betin
    [J]. 2008 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE FOR MODELLING CONTROL & AUTOMATION, VOLS 1 AND 2, 2008, : 522 - 526