Scalable Suffix Sorting on a Multicore Machine

Cited by: 6

Authors:

Xie, Jing Yi [1]

Nong, Ge [1]

Lao, Bin [2]

Xu, Wentao [1]

Affiliations:

[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510275, Guangdong, Peoples R China

[2] Guangdong Univ Foreign Studies, Sch Informat Sci & Technol, Guangzhou 510420, Peoples R China

Source:

IEEE TRANSACTIONS ON COMPUTERS | 2020 / Vol. 69 / No. 09

Funding:

National Natural Science Foundation of China;

Keywords:

Sorting; Random access memory; Indexes; Multicore processing; Arrays; Task analysis; Big Data; Suffix sorting; algorithm design; multicore computer; ARRAY CONSTRUCTION;

DOI:

10.1109/TC.2020.2972546

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

A number of methods have been proposed for suffix sorting in internal memory (RAM) and in external memory (hard disks). The current best results for suffix sorting in internal or external memory are achieved by several algorithms that use the induced sorting (IS) method in various ways. While these algorithms are efficient, the internal-memory ones differ considerably from the external-memory ones in their algorithm designs. A scalable IS method that can be applied to suffix sorting in both internal and external memory is highly desired. This article proposes a blockwise IS method to facilitate pipelined access in internal memory and sequential I/Os in external memory. The detailed algorithm for using this method in a 4-stage pipeline with multiple threads is described, where multiple threads are applied to parallelize not only the pipelined stages of consecutive blocks but also the tasks within each stage wherever possible. In our experiments on a set of realistic and artificial datasets, this algorithm achieves better overall time and space performance than the existing best results from pSACAK, pDSS and pKS. Besides sorting suffixes in internal memory in linear time, the proposed method can be ported to external memory for sorting massive suffixes in linear I/O complexity.
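As background for the abstract, the following is a minimal sketch of the suffix sorting problem itself, together with the L-type/S-type suffix classification that induced sorting methods commonly start from. It is not the blockwise, pipelined algorithm proposed in the article: the function names `suffix_array_naive` and `classify_suffix_types` are hypothetical, the naive sort is far from linear time, and the last suffix is treated as L-type under the assumption of a virtual end-of-text sentinel smaller than every character.

```python
def suffix_array_naive(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text` in
    lexicographic order. Naive reference only: it compares whole
    suffixes, so it is much slower than an induced-sorting algorithm."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def classify_suffix_types(text: str) -> str:
    """Label each suffix as 'S' (smaller than the next suffix) or
    'L' (larger than the next suffix) with one right-to-left scan.

    Assumption: a virtual sentinel smaller than every character
    follows the text, which makes the last suffix L-type."""
    n = len(text)
    if n == 0:
        return ""
    types = ["L"] * n
    for i in range(n - 2, -1, -1):
        if text[i] < text[i + 1]:
            types[i] = "S"
        elif text[i] == text[i + 1]:
            # Equal first characters: the type is inherited from the next suffix.
            types[i] = types[i + 1]
        # Otherwise text[i] > text[i + 1], and the suffix stays L-type.
    return "".join(types)


if __name__ == "__main__":
    print(suffix_array_naive("banana"))
    print(classify_suffix_types("banana"))
```

For the text `banana`, the naive sort orders the suffixes as `a`, `ana`, `anana`, `banana`, `na`, `nana`, which corresponds to the start positions 5, 3, 1, 0, 4, 2.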

Pages: 1364 - 1375

Page count: 12

50 items in total

[41] SMR: Scalable MapReduce for Multicore Systems
Zhang, Yu
Yu, Yufen
Chen, Jiankang
IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2018, : 684 - 691
[42] A Grammar Compression Algorithm based on Induced Suffix Sorting
Nogueira Nunes, Daniel Saad
Louza, Felipe A.
Gog, Simon
Ayala-Rincon, Mauricio
Navarro, Gonzalo
2018 DATA COMPRESSION CONFERENCE (DCC 2018), 2018, : 42 - 51
[43] Differentiable Sorting Networks for Scalable Sorting and Ranking Supervision
Petersen, Felix
Borgelt, Christian
Kuehne, Hilde
Deussen, Oliver
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
[44] Highly scalable computational algorithms on emerging parallel machine multicore architectures: development and implementation in CFD context
Kannan, R.
Harrand, V.
Lee, M.
Przekwas, A. J.
INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, 2013, 73 (10) : 869 - 882
[45] Antisequential suffix sorting for BWT-based data compression
Baron, D
Bresler, Y
IEEE TRANSACTIONS ON COMPUTERS, 2005, 54 (04) : 385 - 397
[46] Optimal suffix sorting and LCP array construction for constant alphabets
Louza, Felipe A.
Gog, Simon
Telles, Guilherme P.
INFORMATION PROCESSING LETTERS, 2017, 118 : 30 - 34
[47] Building and Checking Suffix Array Simultaneously by Induced Sorting Method
Lao, Bin
Wu, Yi
Nong, Ge
Chan, Wai Hong
IEEE TRANSACTIONS ON COMPUTERS, 2022, 71 (04) : 756 - 765
[48] Suffix sorting via Shannon-Fano-Elias codes
Adjeroh, Don
Nan, Fei
DCC: 2008 DATA COMPRESSION CONFERENCE, PROCEEDINGS, 2008, : 502 - 502
[49] Scalable RNA Sequencing on Clusters of Multicore Processors
Martinez, Hector
Barrachina, Sergio
Castillo, Maribel
Tarraga, Joaquin
Medina, Ignacio
Dopazo, Joaquin
Quintana-Orti, Enrique S.
2015 IEEE TRUSTCOM/BIGDATASE/ISPA, VOL 3, 2015, : 190 - 195
[50] Scalable Analysis of Multicore Data Reuse and Sharing
Pericas, Miquel
Taura, Kenjiro
Matsuoka, Satoshi
PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, (ICS'14), 2014, : 353 - 362