GPU implementation of a parallel two-list algorithm for the subset-sum problem

Times cited: 15
Authors
Wan, Lanjun [1]
Li, Kenli [1, 2]
Liu, Jing [1]
Li, Keqin [1, 2, 3]
Affiliations
[1] Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
[2] Natl Supercomp Ctr Changsha, Changsha 410082, Hunan, Peoples R China
[3] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
Source
Funding
National Natural Science Foundation of China;
Keywords
CUDA; GPU implementation; knapsack problem; parallel two-list algorithm; subset-sum problem; KNAPSACK-PROBLEM;
DOI
10.1002/cpe.3201
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
The subset-sum problem is a well-known non-deterministic polynomial-time complete (NP-complete) decision problem. This paper proposes a novel and efficient implementation of a parallel two-list algorithm for solving the problem on a graphics processing unit (GPU) using Compute Unified Device Architecture (CUDA). The algorithm is composed of a generation stage, a pruning stage, and a search stage. It is not easy to effectively implement the three stages of the algorithm on a GPU. Ways to achieve better performance, reasonable task distribution between CPU and GPU, effective GPU memory management, and CPU-GPU communication cost minimization are discussed. The generation stage of the algorithm adopts a typical recursive divide-and-conquer strategy. Because recursion cannot be well supported by current GPUs with compute capability less than 3.5, a new vector-based iterative implementation mechanism is designed to replace the explicit recursion. Furthermore, to optimize the performance of the GPU implementation, this paper improves the three stages of the algorithm. The experimental results show that the GPU implementation has much better performance than the CPU implementation and can achieve high speedup on different GPU cards. The experimental results also illustrate that the improved algorithm can bring significant performance benefits for the GPU implementation. Copyright (C) 2014 John Wiley & Sons, Ltd.
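As a rough illustration of the two-list idea the abstract refers to, the sketch below shows the classic sequential scheme in Python: split the items into two halves, generate the sorted subset sums of each half, then scan the two lists from opposite ends. It is not the paper's GPU implementation. The function names and the choice of a simple iterative merge to stand in for the recursion-free generation stage are assumptions made for this example, and the parallel pruning stage is omitted because it has no counterpart in a single-threaded scan.

```python
def generate_sums(weights):
    """Iteratively build all subset sums of `weights` in ascending order.

    Each item doubles the list: the existing sums are merged with a copy
    shifted by the item's weight. No recursion is used.
    """
    sums = [0]
    for w in weights:
        shifted = [s + w for s in sums]
        merged = []
        i = j = 0
        # Standard merge of two sorted lists.
        while i < len(sums) and j < len(shifted):
            if sums[i] <= shifted[j]:
                merged.append(sums[i])
                i += 1
            else:
                merged.append(shifted[j])
                j += 1
        merged.extend(sums[i:])
        merged.extend(shifted[j:])
        sums = merged
    return sums


def two_list_subset_sum(weights, target):
    """Return True if some subset of `weights` sums exactly to `target`."""
    half = len(weights) // 2
    a = generate_sums(weights[:half])        # ascending
    b = generate_sums(weights[half:])[::-1]  # descending
    i = j = 0
    # Search stage: advance in `a` to raise the total, in `b` to lower it.
    while i < len(a) and j < len(b):
        total = a[i] + b[j]
        if total == target:
            return True
        if total < target:
            i += 1
        else:
            j += 1
    return False
```

For n items each list holds 2^(n/2) sums, so the search touches at most 2^(n/2 + 1) pairs instead of all 2^n subsets.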
Pages: 119-145
Number of pages: 27
Related papers
50 in total
  • [31] Observations on optimal parallelizations of two-list algorithm
    Alonso Sanches, Carlos Alberto
    Soma, Nei Yoshihiro
    Yanasse, Horacio Hideki
    PARALLEL COMPUTING, 2010, 36 (01) : 65 - 67
  • [32] Solving the Subset-Sum problem by P systems with active membranes
    Mario J. Pérez Jiménez
    Agustín Riscos Núñez
    New Generation Computing, 2005, 23 : 339 - 356
  • [33] Solving the subset-sum problem by P systems with active membranes
    Jiménez, MJP
    Núñez, AR
    NEW GENERATION COMPUTING, 2005, 23 (04) : 339 - 356
  • [34] An efficient fully polynomial approximation scheme for the Subset-Sum Problem
    Kellerer, H
    Mansini, R
    Pferschy, U
    Speranza, MG
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2003, 66 (02) : 349 - 370
  • [35] Molecular solutions for the subset-sum problem on DNA-based supercomputing
    Chang, WL
    Ho, MSH
    Guo, M
    BIOSYSTEMS, 2004, 73 (02) : 117 - 130
  • [36] A subset-sum type formulation of a two-agent single-machine scheduling problem
    Avolio, Matteo
    Fuduli, Antonio
    INFORMATION PROCESSING LETTERS, 2020, 155
  • [37] A parallel Time/Processor tradeoff T.P=0(nlogM/M) for the subset-sum problem
    Chedid, FB
    PARALLEL AND DISTRIBUTED COMPUTING SYSTEMS, 2001, : 478 - 481
  • [38] WORST-CASE ANALYSIS OF AN APPROXIMATION SCHEME FOR THE SUBSET-SUM PROBLEM
    FISCHETTI, M
    OPERATIONS RESEARCH LETTERS, 1986, 5 (06) : 283 - 284
  • [39] WORST-CASE ANALYSIS OF GREEDY ALGORITHMS FOR THE SUBSET-SUM PROBLEM
    MARTELLO, S
    TOTH, P
    MATHEMATICAL PROGRAMMING, 1984, 28 (02) : 198 - 205
  • [40] Approximate minimization algorithms for the 0/1 Knapsack and Subset-Sum Problem
    Güntzer, MM
    Jungnickel, D
    OPERATIONS RESEARCH LETTERS, 2000, 26 (02) : 55 - 66