Locating Cache Performance Bottlenecks Using Data Profiling

Authors
Pesterev, Aleksey [1]
Zeldovich, Nickolai [1]
Morris, Robert T. [1]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
Keywords
Cache Misses; Data Profiling; Debug Registers; Statistical Profiling
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Effective use of CPU data caches is critical to good performance, but poor cache use patterns are often hard to spot using existing execution profiling tools. Typical profilers attribute costs to specific code locations. The costs due to frequent cache misses on a given piece of data, however, may be spread over instructions throughout the application. The resulting individually small costs at a large number of instructions can easily appear insignificant in a code profiler's output. DProf helps programmers understand cache miss costs by attributing misses to data types instead of code. Associating cache misses with data helps programmers locate data structures that experience misses in many places in the application's code. DProf introduces a number of new views of cache miss data, including a data profile, which reports the data types with the most cache misses, and a data flow graph, which summarizes how objects of a given type are accessed throughout their lifetime, and which accesses incur expensive cross-CPU cache loads. We present two case studies of using DProf to find and fix cache performance bottlenecks in Linux. The improvements provide a 16-57% throughput improvement on a range of memcached and Apache workloads.
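
The contrast the abstract draws between code-oriented and data-oriented attribution can be illustrated with a minimal sketch. This is not DProf's implementation: the sample records, function names, and type names below are all hypothetical, and the sketch assumes each sampled cache miss has already been resolved to both a code location and a data type.

```python
from collections import Counter

# Hypothetical miss samples: (code location, data type of the missed address).
# All names here are made up for illustration; none come from the paper.
samples = [
    ("tcp_recv", "skbuff"),
    ("tcp_send", "skbuff"),
    ("dev_xmit", "skbuff"),
    ("net_rx", "skbuff"),
    ("alloc_pages", "page"),
    ("alloc_pages", "page"),
    ("alloc_pages", "page"),
    ("sched_tick", "task"),
]


def code_profile(samples):
    """Count misses per code location, as a conventional code profiler would."""
    return Counter(location for location, _ in samples)


def data_profile(samples):
    """Count misses per data type, the grouping the abstract describes."""
    return Counter(data_type for _, data_type in samples)


if __name__ == "__main__":
    # In the code view, the skbuff misses are split across four locations.
    print(code_profile(samples).most_common())
    # In the data view, the same misses add up under a single type.
    print(data_profile(samples).most_common())
```

In this toy data set no single code location shows more than one miss on the hypothetical `skbuff` type, so a code-oriented view ranks `alloc_pages` first, while grouping by data type puts `skbuff` at the top.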
Pages: 335-348
Page count: 14