A programmable co-processor for profiling

Cited by: 16
Authors
Zilles, CB [1]
Sohi, GS [1]
Affiliation
[1] Univ Wisconsin, Dept Comp Sci, Madison, WI 53706 USA
DOI
10.1109/HPCA.2001.903267
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Aggressive program optimization requires accurate profile information, but such accuracy requires many samples to be collected. We explore a novel profiling architecture that reduces the overhead of collecting each sample by including a programmable co-processor that analyzes a stream of profile samples generated by a microprocessor. From this stream of samples, the co-processor can detect correlations between instructions (e.g., memory dependence profiling) as well as those between different dynamic instances of the same instruction (e.g., value profiling). The profiler's programmable nature allows a broad range of data to be extracted, post-processed and formatted, and provides the flexibility to tailor the profiling application to the program under test. Because the co-processor is specialized for profiling, it can execute profiling applications more efficiently than a general-purpose processor. The co-processor should not significantly impact the cost or performance of the main processor because it can be implemented using a small number of transistors at the chip's periphery. We demonstrate the proposed design through a detailed evaluation of load value profiling. Our implementation quickly and accurately estimates the value invariance of loads, with time overhead roughly proportional to the size of the instruction working set of the program. This algorithm demonstrates a number of general techniques for profiling, including: estimating the completeness of a profile, a means to focus profiling on particular instructions, and management of profiling resources.
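To make "value invariance of loads" concrete, the following is a minimal software sketch, not the co-processor algorithm evaluated in the paper. It assumes a profile sample is simply a (load PC, loaded value) pair and takes a load's invariance to be the fraction of its samples that equal its most frequent value. The function name `estimate_invariance` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_invariance(samples):
    """Map each load PC to the fraction of its samples equal to its most common value.

    `samples` is an iterable of (pc, value) pairs; this sample format is an
    assumption made for illustration, not the paper's hardware sample format.
    """
    values_by_pc = defaultdict(Counter)
    for pc, value in samples:
        values_by_pc[pc][value] += 1

    return {
        pc: counts.most_common(1)[0][1] / sum(counts.values())
        for pc, counts in values_by_pc.items()
    }


# Hypothetical sample stream: two static loads, identified by their PCs.
samples = [
    (0x400, 7), (0x400, 7), (0x400, 7), (0x400, 9),
    (0x408, 1), (0x408, 2),
]
invariance = estimate_invariance(samples)
```

A load with invariance near 1.0 almost always produces the same value, which is the property that value-based optimizations look for. This sketch keeps an unbounded table per load; a hardware profiler would need to bound that state.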
Pages: 241 - 252
Page count: 12
Related papers
50 in total
  • [31] DReAC: A novel dynamically reconfigurable co-processor
    Song, Yu-Kun
    Gao, Ming-Lun
    Deng, Hong-Hui
    Wang, Rui
    Hu, Yong-Hua
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2007, 35 (05): 833 - 837
  • [32] FPGA prototype of the REALJava co-processor
    Sannti, Tero
    Tyystjaervi, Joonas
    Plosila, Juha
    2007 INTERNATIONAL SYMPOSIUM ON SYSTEM-ON-CHIP PROCEEDINGS, 2007, : 70 - +
  • [33] A Unified Co-Processor Architecture for Matrix Decomposition
    Dou, Yong
    Zhou, Jie
    Wu, Gui-Ming
    Jiang, Jing-Fei
    Lei, Yuan-Wu
    Ni, Shi-Ce
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2010, 25 (04) : 874 - 885
  • [34] Designing a binary neural network co-processor
    Freeman, M
    Austin, J
    DSD 2005: 8th Euromicro Conference on Digital System Design, Proceedings, 2005, : 223 - 226
  • [35] Estimating the utilization of embedded FPGA co-processor
    Qu, Y
    Soininen, JP
    EUROMICRO SYMPOSIUM ON DIGITAL SYSTEM DESIGN, PROCEEDINGS, 2003, : 214 - 221
  • [36] Design of Bitstream Co-processor for Multimedia Applications
    Gao, Yingke
    Huan, Ying
    Xue, Zhiyuan
    Zhang, Tiejun
    Wang, Donghui
    Hou, Chaohuan
    2013 IEEE 11TH INTERNATIONAL CONFERENCE ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING (DASC), 2013, : 227 - 230
  • [37] Design of packet classification co-processor with FPGA
    Wang, YG
    Yan, TX
    ESA '05: PROCEEDINGS OF THE 2005 INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS AND APPLICATIONS, 2005, : 88 - 94
  • [38] A CO-PROCESSOR FOR UNIFICATION IN PROLOG - THE MICROPROGRAMMING LEVEL
    DEBLASI, M
    GENTILE, A
    MICROPROCESSING AND MICROPROGRAMMING, 1988, 23 (1-5): 143 - 147
  • [39] Viterbi decoding on a co-processor architecture with vector parallelism
    Engin, N
    van Berkel, K
    SIPS 2003: IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS: DESIGN AND IMPLEMENTATION, 2003, : 334 - 339
  • [40] A case study for formal verification of a Timing Co-Processor
    Rodrigues, Cristiano
    LATW: 2009 10TH LATIN AMERICAN TEST WORKSHOP, 2009, : 43 - 48