Global optimal partitioning of parallel loops for minimal data movement in limited memory embedded systems

Cited by: 0
Authors
Lin, J [1]
Lin, XL [1]
Affiliations
[1] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Embedded systems are often characterized by limited memory, yet many applications on these systems are memory-intensive. Reducing the overhead of data movement between global memory and distributed local memory in such a system is critical to the performance of these applications. In this paper, we propose a unified theoretical framework for automatically partitioning parallel loops to optimize data movement on such systems. We first introduce the notion of data movement and build a simple but accurate model to estimate the data-movement overhead for the footprint. We then present an algorithm that derives an optimal loop partitioning to minimize the amount of data movement across the loop nests. We have implemented the framework in a parallel compiler for VE16, a commercial limited-memory embedded system, and the experimental results demonstrate the efficiency of the proposed method.
Pages: 3-9
Number of pages: 7