GPU implementation of a parallel two-list algorithm for the subset-sum problem

Cited by: 15
Authors
Wan, Lanjun [1]
Li, Kenli [1,2]
Liu, Jing [1]
Li, Keqin [1,2,3]
Affiliations
[1] Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
[2] Natl Supercomp Ctr Changsha, Changsha 410082, Hunan, Peoples R China
[3] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
Source
Funding
National Natural Science Foundation of China
Keywords
CUDA; GPU implementation; knapsack problem; parallel two-list algorithm; subset-sum problem
DOI
10.1002/cpe.3201
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
The subset-sum problem is a well-known non-deterministic polynomial-time complete (NP-complete) decision problem. This paper proposes a novel and efficient implementation of a parallel two-list algorithm for solving the problem on a graphics processing unit (GPU) using Compute Unified Device Architecture (CUDA). The algorithm is composed of a generation stage, a pruning stage, and a search stage. It is not easy to effectively implement the three stages of the algorithm on a GPU. Ways to achieve better performance, reasonable task distribution between CPU and GPU, effective GPU memory management, and CPU-GPU communication cost minimization are discussed. The generation stage of the algorithm adopts a typical recursive divide-and-conquer strategy. Because recursion cannot be well supported by current GPUs with compute capability less than 3.5, a new vector-based iterative implementation mechanism is designed to replace the explicit recursion. Furthermore, to optimize the performance of the GPU implementation, this paper improves the three stages of the algorithm. The experimental results show that the GPU implementation has much better performance than the CPU implementation and can achieve high speedup on different GPU cards. The experimental results also illustrate that the improved algorithm can bring significant performance benefits for the GPU implementation. Copyright (C) 2014 John Wiley & Sons, Ltd.
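For orientation, the following is a minimal sequential sketch of the classic two-list idea that the abstract builds on; it is not the authors' GPU implementation. It assumes integer weights, builds each list of subset sums with an iterative merge rather than recursion (in the spirit of the iterative generation stage the abstract describes), and then runs a two-pointer search. The pruning stage and all CUDA-specific details are omitted because the abstract does not specify them, and the function names `subset_sums` and `two_list_subset_sum` are hypothetical.

```python
from heapq import merge


def subset_sums(weights):
    """Return all 2**len(weights) subset sums in ascending order.

    Built iteratively: each step merges the current sorted list with a
    copy of itself shifted by the next weight (no recursion needed).
    """
    sums = [0]
    for w in weights:
        shifted = [s + w for s in sums]      # still sorted, every entry moved by w
        sums = list(merge(sums, shifted))    # merge two sorted lists into one
    return sums


def two_list_subset_sum(weights, target):
    """Decide whether some subset of `weights` sums exactly to `target`."""
    half = len(weights) // 2
    list_a = subset_sums(weights[:half])           # ascending order
    list_b = subset_sums(weights[half:])[::-1]     # descending order

    # Search stage: walk A upward and B downward until the pair sum hits the target.
    i, j = 0, 0
    while i < len(list_a) and j < len(list_b):
        total = list_a[i] + list_b[j]
        if total == target:
            return True
        if total < target:
            i += 1    # pair sum too small: move to a larger entry of A
        else:
            j += 1    # pair sum too large: move to a smaller entry of B
    return False
```

For example, `two_list_subset_sum([3, 34, 4, 12, 5, 2], 9)` splits the weights into `[3, 34, 4]` and `[12, 5, 2]`, generates eight sums per half, and finds a matching pair such as 4 + 5. Each list holds 2^(n/2) sums for n weights, which is why the generation and search stages are natural targets for parallelization.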
Pages: 119-145
Page count: 27