A Wire Delay Scalable Stream Processor Architecture

Authors
Xu, Guang [1, 2]
An, Hong [1, 2]
Cong, Ming [1, 2]
Wang, Fang [1, 2]
Ren, Yongqing [1, 2]
Affiliations
[1] Univ Sci & Technol China, Dept Comp Sci & Technol, Hefei 230026, Peoples R China
[2] Chinese Acad Sci, Key Lab Comp Syst Architecture, Beijing 100080, Peoples R China
Funding
National Natural Science Foundation of China
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Growing on-chip wire delays will force many future microarchitectures to be distributed. The centralized control and data transmission of the conventional stream processor need to be improved: the hardware resources within a single stream processor become tiles on one or more switched micronetworks. In this paper, we introduce the architecture of a tiled stream processor, which aims to adapt to increasing wire resistance. The tiled stream processor consists of tile arrays and the distributed control and data networks that connect the tiles. The tile arrays comprise five types of reused tiles, and a control packet transferred over the control network traverses a tile in one cycle. The architecture supports explicit data management in hardware and includes a two-level register hierarchy, controlled by software, to capture data locality. The tiled stream processor uses the stream programming model of the StreamC/KernelC languages; the kernel microcode executed on the tile arrays is statically scheduled into instruction blocks and executed dynamically in dataflow order. Finally, we discuss features that affect kernel performance; with a one-cycle routing delay, the simulator achieves an average IPC of 10 across eight kernels.
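The phrase "statically scheduled into instruction blocks and executed dynamically in dataflow order" can be illustrated with a minimal sketch. It is not the paper's scheduler, microcode format, or simulator: the function `run_block`, the block layout, and the example instructions are all hypothetical, chosen only to show that an instruction fires as soon as its operands are available, regardless of where it sits in the block.

```python
import operator
from collections import deque

def run_block(instrs, inputs):
    """Execute one instruction block in dataflow order (illustrative only).

    instrs: dict mapping a result name to (function, list of operand names)
    inputs: dict of values that are available when the block starts
    Returns (all values, the order in which instructions fired).
    """
    values = dict(inputs)
    # Number of operands each instruction is still waiting for.
    waiting = {name: sum(1 for s in srcs if s not in values)
               for name, (_, srcs) in instrs.items()}
    # Map each producer to the instructions that consume its result.
    consumers = {}
    for name, (_, srcs) in instrs.items():
        for s in srcs:
            consumers.setdefault(s, []).append(name)
    ready = deque(name for name, w in waiting.items() if w == 0)
    order = []
    while ready:
        name = ready.popleft()
        fn, srcs = instrs[name]
        values[name] = fn(*(values[s] for s in srcs))
        order.append(name)
        # Wake up consumers whose last missing operand just arrived.
        for c in consumers.get(name, []):
            waiting[c] -= 1
            if waiting[c] == 0:
                ready.append(c)
    return values, order

# Hypothetical block, deliberately listed out of dependency order.
block = {
    "out": (operator.add, ["t1", "t2"]),
    "t1": (operator.mul, ["t0", "c"]),
    "t2": (operator.sub, ["t0", "a"]),
    "t0": (operator.add, ["a", "b"]),
}
values, order = run_block(block, {"a": 2, "b": 3, "c": 4})
print(values["out"], order)
```

The listing order of the block does not matter here: `t0` fires first because it is the only instruction whose operands are all block inputs, and `out` fires last. A real tiled design would also account for routing delay between tiles, which this sketch ignores.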
Pages: 132 / +
Page count: 2