Automatic measurement of instruction cache capacity

Cited by: 4
Authors
Yotov, Kamen [1]
Jackson, Sandra [1]
Steele, Tyler [1]
Pingali, Keshav [1]
Stodghill, Paul [1]
Affiliations
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
DOI
10.1007/978-3-540-69330-7_16
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
There is growing interest in autonomic computing systems that can optimize their own behavior on different platforms without manual intervention. Examples of successful self-optimizing systems are ATLAS, which generates Basic Linear Algebra Subprograms (BLAS) libraries, and FFTW, which generates FFT libraries. Self-optimizing systems may need the values of hardware parameters such as the number of registers of various types and the capacities of caches at various levels. For example, ATLAS uses the capacity of the L1 cache and the number of registers to determine the sizes of cache tiles and register tiles. We have built a system called X-Ray, which uses micro-benchmarks to measure such parameter values automatically. The micro-benchmarks currently implemented in X-Ray can determine the latency of various instructions, the existence of important instructions such as fused multiply-add, the number of registers of various kinds, and parameters of the memory hierarchy. In this paper, we discuss how X-Ray determines the capacity of the instruction cache (I-cache), which is needed for important optimizations such as loop unrolling. We present the micro-benchmark used in X-Ray to measure I-cache capacity, the experimental methodology used to obtain accurate estimates, and experimental results on a large number of current platforms.
Pages: 230 / +
Number of pages: 3