Simple randomized merge/sort on parallel disks

Cited by: 44
Authors
Barve, RD [1]
Grove, EF [1]
Vitter, JS [1]
Affiliation
[1] DUKE UNIV, DEPT COMP SCI, DURHAM, NC 27708
Keywords
I/O; external memory; disk; parallel disks; sorting; merge/sort; merging; forecasting; maximum occupancy; disk striping
DOI
10.1016/S0167-8191(97)00015-X
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
We consider the problem of sorting a file of N records on the D-disk model of parallel I/O, in which there are two sources of parallelism. Records are transferred to and from disk concurrently in blocks of B contiguous records. In each I/O operation, up to one block can be transferred to or from each of the D disks in parallel. We propose a simple, efficient, randomized mergesort algorithm called SRM that uses a forecast-and-flush approach to overcome the inherent difficulties of simple merging on parallel disks. SRM exhibits a limited use of randomization and also has a useful deterministic version. Generalizing the technique of forecasting, our algorithm is able to read in, at any time, the 'right' block from any disk; using the technique of flushing, it evicts, without any I/O overhead, just the 'right' blocks from memory to make space for new ones to be read in. The disk layout of SRM is such that it enjoys perfect write parallelism, avoiding fundamental inefficiencies of previous mergesort algorithms. By analyzing generalized maximum occupancy problems, we derive an analytical upper bound on SRM's expected overhead that is valid for arbitrary inputs. This upper bound on the expected I/O performance of SRM indicates that SRM is provably better than disk-striped mergesort (DSM) for realistic values of the parameters D, M and B. Average-case simulations show further improvement on the analytical upper bound. Unlike previously proposed optimal sorting algorithms, SRM outperforms DSM even when the number D of parallel disks is small.
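The I/O model in the abstract can be illustrated with a small Python sketch. It is not taken from the paper: the function names are hypothetical, and the occupancy simulation uses the classical balls-into-bins problem rather than the generalized version the authors analyze. The first function counts the fewest parallel I/O operations needed to transfer N records once when each operation moves at most one block of B records per disk. The second estimates the expected load on the fullest disk when blocks fall on disks uniformly at random, which matters because a disk delivers only one block per I/O operation.

```python
import math
import random
from collections import Counter


def min_ios_per_pass(n_records, n_disks, block_size):
    """Fewest parallel I/Os to read (or write) n_records once.

    Assumes the D-disk model: each I/O moves at most one block of
    block_size records per disk, so at most n_disks blocks in total.
    """
    blocks = math.ceil(n_records / block_size)
    return math.ceil(blocks / n_disks)


def average_max_occupancy(n_balls, n_bins, trials=1000, seed=0):
    """Monte Carlo estimate of the expected maximum bin load.

    Classical occupancy problem (an illustrative assumption, not the
    paper's generalized variant): each ball picks a bin uniformly at
    random. Balls stand for blocks, bins stand for disks.
    """
    if n_balls == 0:
        return 0.0
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        counts = Counter(rng.randrange(n_bins) for _ in range(n_balls))
        total += max(counts.values())
    return total / trials


if __name__ == "__main__":
    # One pass over 1000 records with D = 4 disks and B = 10 records per block.
    print(min_ios_per_pass(1000, 4, 10))
    # Average load on the fullest of 8 disks when 8 blocks are placed at random.
    print(average_max_occupancy(8, 8))
```

With perfectly balanced placement the maximum load for 8 blocks on 8 disks would be 1. The gap between that and the simulated average gives a rough sense of the kind of overhead that an occupancy analysis bounds.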
Pages: 601-631
Number of pages: 31