Skeletons and asynchronous RPC for embedded data and task parallel image processing

Cited by: 2
Authors
Caarls, Wouter [1]
Jonker, Pieter
Corporaal, Henk
Affiliations
[1] Delft Univ Technol, Quantitat Imaging Grp, Delft, Netherlands
[2] Eindhoven Univ Technol, Fac Elect Engn, NL-5600 MB Eindhoven, Netherlands
Keywords
design space exploration; heterogeneous architectures; constrained architectures; algorithmic skeletons; remote procedure call; futures; run-time scheduling
DOI
10.1093/ietisy/e89-d.7.2036
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Developing embedded parallel image processing applications is usually a very hardware-dependent process, often using the single instruction multiple data (SIMD) paradigm and requiring deep knowledge of the processors used. Furthermore, the application is tailored to a specific hardware platform, and if the chosen hardware does not meet the requirements, the application must be rewritten for a new platform. We have proposed the use of design space exploration [9] to find the most suitable hardware platform for a given application. This requires a hardware-independent program, which we achieve with algorithmic skeletons [5] while exploiting the data parallelism inherent in low-level image processing. However, since different operations run best on different kinds of processors, we need to exploit task parallelism as well. This paper describes how we exploit task parallelism using an asynchronous remote procedure call (RPC) system optimized for low-memory and sparsely connected systems such as smart cameras. It uses a futures-like model [16] to present a normal imperative C interface to the user, in which the skeleton calls are implicitly parallelized and pipelined. Simulation provides the task dependency graph and performance numbers for the mapping, which can be done at run time to facilitate data-dependent branching. The result is an easy-to-program, platform-independent framework that shields the user from the parallel implementation and mapping of the application while efficiently utilizing on-chip memory and interconnect bandwidth.
Pages: 2036-2043
Number of pages: 8
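
The futures-like model mentioned in the abstract can be illustrated with a small sketch. This is not the authors' C interface or RPC system: it is a toy analogue in Python, with a thread pool standing in for the platform's processors, and every name in it (`async_call`, `pixel_map`, `pixel_zip`, `process`) is hypothetical. It only shows the general idea that operation calls can return immediately with a future, so that code written in ordinary imperative style runs independent operations in parallel and synchronizes only when a result is actually read. Mapping, scheduling and memory constraints are ignored.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical worker pool standing in for the platform's processors.
_pool = ThreadPoolExecutor(max_workers=4)


def _resolve(value):
    """Wait for a future argument; pass plain values through unchanged."""
    return value.result() if isinstance(value, Future) else value


def async_call(func, *args):
    """Issue `func` asynchronously and return a future immediately.

    Arguments that are futures are resolved inside the worker, so the
    caller never blocks while it builds up the chain of operations.
    Dependencies are always submitted before their dependents, which is
    what keeps this toy version free of deadlock.
    """
    return _pool.submit(lambda: func(*[_resolve(a) for a in args]))


def pixel_map(op, image):
    """Toy pixel-wise skeleton: apply `op` to every pixel independently."""
    return [[op(p) for p in row] for row in image]


def pixel_zip(op, left, right):
    """Toy two-input skeleton: combine corresponding pixels of two images."""
    return [[op(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(left, right)]


def process(image):
    """Reads like sequential code, but every call returns at once."""
    bright = async_call(pixel_map, lambda p: min(p + 10, 255), image)
    inverted = async_call(pixel_map, lambda p: 255 - p, image)  # independent of `bright`
    diff = async_call(pixel_zip, lambda a, b: abs(a - b), bright, inverted)
    return diff  # still a future; nothing has been waited on yet


result = process([[0, 100], [200, 250]]).result()  # synchronize only here
```

In this sketch `bright` and `inverted` have no dependency on each other and may run concurrently, while `diff` waits for both inside its worker. The only blocking call in user code is the final `.result()`.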