Automatic measurement of instruction cache capacity

Cited by: 4
Authors
Yotov, Kamen [1]
Jackson, Sandra [1]
Steele, Tyler [1]
Pingali, Keshav [1]
Stodghill, Paul [1]
Affiliation
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
DOI
10.1007/978-3-540-69330-7_16
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
There is growing interest in autonomic computing systems that can optimize their own behavior on different platforms without manual intervention. Examples of successful self-optimizing systems are ATLAS, which generates Basic Linear Algebra Subroutine (BLAS) libraries, and FFTW, which generates FFT libraries. Self-optimizing systems may need the values of hardware parameters such as the number of registers of various types and the capacities of caches at various levels. For example, ATLAS uses the capacity of the L1 cache and the number of registers in determining the size of cache tiles and register tiles. We have built a system called X-Ray, which uses micro-benchmarks to measure such parameter values automatically. The micro-benchmarks currently implemented in X-Ray can determine the latency of various instructions, the existence of important instructions like fused multiply-add, the number of registers of various kinds, and parameters of the memory hierarchy. In this paper, we discuss how X-Ray determines the capacity of the instruction cache (I-cache), which is needed for important optimizations such as loop unrolling. We present the micro-benchmark used in X-Ray to measure I-cache capacity, the experimental methodology used to obtain accurate estimates, and experimental results on a large number of current platforms.
Pages: 230 / +
Page count: 3
Related papers
50 in total
  • [21] Energy-aware instruction cache design using small trace cache
    Kim, J. M.
    Chung, S. W.
    Kim, C. H.
    IET COMPUTERS AND DIGITAL TECHNIQUES, 2010, 4 (04): : 293 - 305
  • [22] PP-cache: A partitioned power-aware instruction cache architecture
    Kim, Cheol Hong
    Chung, Sung Woo
    Jhon, Chu Shik
    MICROPROCESSORS AND MICROSYSTEMS, 2006, 30 (05) : 268 - 279
  • [23] Instruction Cache Tuning for Embedded Multitasking Applications
    Dash, Santanu Kumar
    Srikanthan, Thambipillai
    RSP 2009: TWENTIETH IEEE/IFIP INTERNATIONAL SYMPOSIUM ON RAPID SYSTEM PROTOTYPING, PROCEEDINGS: SHORTENING THE PATH FROM SPECIFICATION TO PROTOTYPE, 2009, : 152 - 158
  • [24] CACHE ENHANCEMENT FOR STORE MULTIPLE INSTRUCTION.
    Capozzi, A.J.
    Kelley, W.J.
    Wassel, E.R.
    IBM technical disclosure bulletin, 1984, 27 (7 A): : 3943 - 3944
  • [25] Energy consumption measurement technique for automatic instruction set characterization of embedded processors
    Wendt, M.
    Grumer, M.
    Steger, C.
    Weiss, R.
    Neffe, U.
    Muehlberger, A.
    2007 IEEE INSTRUMENTATION & MEASUREMENT TECHNOLOGY CONFERENCE, VOLS 1-5, 2007, : 1419 - +
  • [26] Cache Simulation for Instruction Set Simulator QEMU
    Tran Van Dung
    Taniguchi, Ittetsu
    Tomiyama, Hiroyuki
    2014 IEEE 12TH INTERNATIONAL CONFERENCE ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING (DASC)/2014 IEEE 12TH INTERNATIONAL CONFERENCE ON EMBEDDED COMPUTING (EMBEDDEDCOM)/2014 IEEE 12TH INTERNATIONAL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING (PICOM), 2014, : 441 - +
  • [27] Large-Capacity and High-Speed Instruction Cache Based on Divide-by-2 Memory Banks
    Qing-Qing Li
    Zhi-Guo Yu
    Yi Sun
    Jing-He Wei
    Xiao-Feng Gu
    Journal of Electronic Science and Technology, 2021, (04) : 335 - 349
  • [28] Large-Capacity and High-Speed Instruction Cache Based on Divide-by-2 Memory Banks
    QingQing Li
    ZhiGuo Yu
    Yi Sun
    JingHe Wei
    XiaoFeng Gu
    Journal of Electronic Science and Technology, 2021, 19 (04) : 335 - 349
  • [29] Instruction Cache Prediction Using Bayesian Networks
    Bartlett, Mark
    Bate, Iain
    Cussens, James
    ECAI 2010 - 19TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2010, 215 : 1099 - 1100
  • [30] BRANCH-PROCESSING INSTRUCTION CACHE.
    Anon
    IBM technical disclosure bulletin, 1986, 29 (01): : 357 - 359