Locating Cache Performance Bottlenecks Using Data Profiling

Authors
Pesterev, Aleksey [1]
Zeldovich, Nickolai [1]
Morris, Robert T. [1]
Affiliation
[1] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
Keywords
Cache Misses; Data Profiling; Debug Registers; Statistical Profiling
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Effective use of CPU data caches is critical to good performance, but poor cache use patterns are often hard to spot using existing execution profiling tools. Typical profilers attribute costs to specific code locations. The costs due to frequent cache misses on a given piece of data, however, may be spread over instructions throughout the application. The resulting individually small costs at a large number of instructions can easily appear insignificant in a code profiler's output. DProf helps programmers understand cache miss costs by attributing misses to data types instead of code. Associating cache misses with data helps programmers locate data structures that experience misses in many places in the application's code. DProf introduces a number of new views of cache miss data, including a data profile, which reports the data types with the most cache misses, and a data flow graph, which summarizes how objects of a given type are accessed throughout their lifetime, and which accesses incur expensive cross-CPU cache loads. We present two case studies of using DProf to find and fix cache performance bottlenecks in Linux. The resulting fixes improve throughput by 16-57% on a range of memcached and Apache workloads.
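The contrast between attributing misses to code and attributing them to data can be illustrated with a toy sketch. This is not DProf's implementation: the sample list, the location strings, and the type names below are all invented for illustration, and the two helper functions are hypothetical. The sketch only shows how the same miss samples rank differently when grouped by instruction location versus by data type.

```python
from collections import Counter

# Hypothetical cache-miss samples: (code location, data type of the missed address).
# All values are made up for illustration; they are not DProf output.
SAMPLES = [
    ("tcp_recv+0x10", "skbuff"),
    ("tcp_send+0x24", "skbuff"),
    ("dev_xmit+0x08", "skbuff"),
    ("ip_rcv+0x30", "skbuff"),
    ("alloc+0x04", "slab"),
    ("alloc+0x04", "slab"),
    ("alloc+0x04", "slab"),
]


def code_profile(samples):
    """Count misses per code location, as a conventional code profiler would."""
    return Counter(location for location, _ in samples)


def data_profile(samples):
    """Count misses per data type, the attribution the abstract describes."""
    return Counter(data_type for _, data_type in samples)


if __name__ == "__main__":
    print("By code location:")
    for location, misses in code_profile(SAMPLES).most_common():
        print(f"  {location:16s} {misses}")
    print("By data type:")
    for data_type, misses in data_profile(SAMPLES).most_common():
        print(f"  {data_type:16s} {misses}")
```

In this invented data set, the hypothetical `skbuff` type accounts for the most misses overall, yet no single instruction touching it stands out in the per-location view, which is the situation the abstract says code profilers tend to hide.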
Pages: 335-348
Number of pages: 14