A Parallel Genetic Algorithm With Dispersion Correction for HW/SW Partitioning on Multi-Core CPU and Many-Core GPU

Cited by: 22
Authors
Hou, Neng [1]
He, Fazhi [1,2]
Zhou, Yi [3]
Chen, Yilin [1]
Yan, Xiaohu [1]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, State Key Lab Software Engn, Wuhan 430072, Hubei, Peoples R China
[2] State Key Lab Digital Mfg Equipment & Technol, Wuhan 430074, Hubei, Peoples R China
[3] Wuhan Univ Sci & Technol, Sch Informat Sci & Engn, Engn Res Ctr Met Automat & Measurement Technol, Wuhan 430081, Hubei, Peoples R China
Source
IEEE ACCESS | 2018 / Vol. 6
Funding
National Science Foundation (USA);
Keywords
Hardware/software co-design; heuristic method; genetic algorithm; multi-core CPU; many-core GPU; HARDWARE-SOFTWARE COSYNTHESIS; HARDWARE/SOFTWARE; OPTIMIZATION; SEGMENTATION; TRACKING; DESIGN;
DOI
10.1109/ACCESS.2017.2776295
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In hardware/software (HW/SW) co-design, HW/SW partitioning is an essential step because it determines which components are implemented in hardware and which in software. Most HW/SW partitioning problems are NP-hard, so large instances have to be solved with heuristic methods. This paper presents a parallel genetic algorithm with dispersion correction for HW/SW partitioning on a CPU-GPU platform. First, an enhanced genetic algorithm with dispersion correction is presented, in which under-constraint individuals are moved step by step into the feasible region. This both strengthens intensification and handles the constraint. Second, cost computation and dispersion correction are run in parallel across individuals, so that for a given problem size the overall run time is reduced while the diversity of the genetic algorithm is preserved. Third, when many under-constraint individuals must be corrected in an irregular way, the computation is complicated and its overhead is large. We therefore present a novel parallel strategy that leverages both a multi-core CPU and a many-core GPU: the costs of each individual are computed in parallel on the GPU, and the under-constraint individuals are corrected in parallel on the multi-core CPU. In this way, dozens of irregular correction steps are mapped to the multi-core CPU and thousands of regular cost computation steps are mapped to the many-core GPU, which yields highly efficient parallel computing. Fourth, at each iteration of the hybrid parallel strategy, the solution vectors of the individuals are transferred to the GPU and their costs are transferred back to the CPU. To further improve the efficiency of the proposed algorithm, we propose an asynchronous transfer pattern (stream concurrency pattern) for CPU-GPU, in which transfer and computation are overlapped so that the overall run time is reduced further. Finally, experiments show that the solution quality obtained by our method is competitive with existing heuristic methods in reasonable time. Furthermore, by combining the multi-core CPU and the many-core GPU, the running time of the proposed method is efficiently reduced.
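The division of work described in the abstract (regular cost evaluation for every individual, irregular step-by-step correction of infeasible ones) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy instance (`HW_AREA`, `SW_TIME`, `TIME_LIMIT`), the objective (minimize hardware area subject to a bound on software time), and the greedy `repair` rule, which is only a stand-in for the paper's dispersion correction. The thread pool merely marks where the parallel steps would sit; the paper maps cost evaluation to a GPU and correction to a multi-core CPU.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy instance: task i needs HW_AREA[i] area in hardware
# or SW_TIME[i] time in software.
HW_AREA = [8, 5, 9, 4, 7, 6]
SW_TIME = [6, 3, 10, 2, 5, 4]
TIME_LIMIT = 12  # assumed bound on the total software time


def costs(x):
    """Return (hardware area, software time); x[i] == 1 means task i is in hardware."""
    area = sum(a for a, bit in zip(HW_AREA, x) if bit)
    time = sum(t for t, bit in zip(SW_TIME, x) if not bit)
    return area, time


def repair(x):
    """Stand-in correction: move one software task to hardware per step until feasible."""
    x = list(x)
    while costs(x)[1] > TIME_LIMIT:
        software = [j for j, bit in enumerate(x) if not bit]
        # Pick the task that saves the most time per unit of added area.
        best = max(software, key=lambda j: SW_TIME[j] / HW_AREA[j])
        x[best] = 1
    return x


def evolve(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    n = len(HW_AREA)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        for _ in range(generations):
            # Irregular step: each individual may need a different number of moves.
            pop = list(pool.map(repair, pop))
            # Regular step: the same cost formula is applied to every individual.
            areas = [area for area, _ in pool.map(costs, pop)]
            order = sorted(range(pop_size), key=areas.__getitem__)
            parents = [pop[i] for i in order[: pop_size // 2]]
            children = []
            while len(parents) + len(children) < pop_size:
                p, q = rng.sample(parents, 2)
                cut = rng.randrange(1, n)      # one-point crossover
                child = p[:cut] + q[cut:]
                child[rng.randrange(n)] ^= 1   # single-bit mutation
                children.append(child)
            pop = parents + children
        # Children of the last generation may be infeasible, so repair once more.
        pop = list(pool.map(repair, pop))
    return min(pop, key=lambda x: costs(x)[0])


if __name__ == "__main__":
    best = evolve()
    print(best, costs(best))
```

Because every individual passes through `repair` before selection, the returned partition always satisfies the assumed time bound. CPython threads do not speed up this pure-Python workload; a real implementation would replace the two `pool.map` calls with a GPU kernel and a multi-core CPU routine.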
Pages: 883 - 898
Page count: 16
Related papers
50 in total
  • [21] Parallel Monte Carlo Tree Search from Multi-core to Many-core Processors
    Mirsoleimani, S. Ali
    Plaat, Aske
    van den Herik, Jaap
    Vermaseren, Jos
    2015 IEEE TRUSTCOM/BIGDATASE/ISPA, VOL 3, 2015, : 77 - 83
  • [22] Parallel Implementations of the Cooperative Particle Swarm Optimization on Many-core and Multi-core Architectures
    Nedjah, Nadia
    Calazan, Rogerio de M.
    Mourelle, Luiza de Macedo
    Wang, Chao
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2016, 44 (06) : 1173 - 1199
  • [23] A High Performance Parallel Ranking SVM with OpenCL on Multi-core and Many-core Platforms
    Zhu, Huming
    Li, Pei
    Zhang, Peng
    Luo, Zheng
    INTERNATIONAL JOURNAL OF GRID AND HIGH PERFORMANCE COMPUTING, 2019, 11 (01) : 17 - 28
  • [24] Ecosystems for the Development of Multi-Core and Many-Core SoC Models
    Wassal, Amr G.
    Abdelfattah, Moataz A.
    Ismail, Yehea I.
    2010 INTERNATIONAL CONFERENCE ON MICROELECTRONICS, 2010, : 264 - 267
  • [25] Revision of Relational Joins for Multi-Core and Many-Core Architectures
    Krulis, Martin
    Yaghob, Jakub
    DATESO 2011: DATABASES, TEXTS, SPECIFICATIONS, OBJECTS, 2011, 706 : 229 - 240
  • [26] Solving Matrix Equations on Multi-Core and Many-Core Architectures
    Benner, Peter
    Ezzatti, Pablo
    Mena, Hermann
    Quintana-Orti, Enrique S.
    Remon, Alfredo
    ALGORITHMS, 2013, 6 (04) : 857 - 870
  • [27] EXPLOITING MULTI-CORE AND MANY-CORE PARALLELISM FOR SUBSPACE CLUSTERING
    Datta, Amitava
    Kaur, Amardeep
    Lauer, Tobias
    Chabbouh, Sami
    INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE, 2019, 29 (01) : 81 - 91
  • [28] Parallel Multi-Core CPU and GPU for Fast and Robust Medical Image Watermarking
    Hosny, Khalid M.
    Darwish, Mohamed M.
    Li, Kenli
    Salah, Ahmad
    IEEE ACCESS, 2018, 6 : 77212 - 77225
  • [29] RTL Test Generation on Multi-Core and Many-Core Architectures
    Varadarajan, Aravind Krishnan
    Hsiao, Michael S.
    2019 32ND INTERNATIONAL CONFERENCE ON VLSI DESIGN AND 2019 18TH INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS (VLSID), 2019, : 100 - 105
  • [30] A Fine-Grained Parallel Particle Swarm Optimization on Many-core and Multi-core Architectures
    Nedjah, Nadia
    Calazan, Rogerio de Moraes
    Mourelle, Luiza de Macedo
    PARALLEL COMPUTING TECHNOLOGIES (PACT 2017), 2017, 10421 : 215 - 224