Toward a new approach for sorting extremely large data files in the big data era

Cited by: 6
Authors
Shatnawi, Ali [1]
AlZahouri, Yathrip [1]
Shehab, Mohammed A. [1]
Jararweh, Yaser [1]
Al-Ayyoub, Mahmoud [1]
Affiliations
[1] Jordan Univ Sci & Technol, Box 3030, Irbid 22110, Jordan
Keywords
Big data; Sorting; External merge sort; Large file processing; Hybrid CPU-GPU; Algorithms
DOI
10.1007/s10586-018-2860-1
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The volume of data and content generated today calls for a paradigm shift in how such data are processed and managed. Sorting is one of the most important data processing operations. Extremely large files are normally sorted with external algorithms such as merge sort. For such files, swapping time is dominant in many applications, so algorithms that minimize swap operations normally outperform those that focus only on optimizing CPU time. Using multiple passes in external merge sort greatly speeds up the sorting of extremely large data files: the multiple passes over the data set used in the proposed algorithm greatly reduce the number of swaps and thus the overall sorting time. Moreover, the proposed technique is well suited to emerging parallelization platforms such as GPUs. The reported results show the superiority of the proposed technique for both "CPU only" and hybrid CPU-GPU implementations.
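For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of a textbook two-phase external merge sort. It is not the authors' multi-pass algorithm and has no GPU component; the function name `external_merge_sort`, the `run_size` parameter, and the one-integer-per-line file format are assumptions made purely for illustration.

```python
import heapq
import os
import tempfile
from itertools import islice


def external_merge_sort(in_path, out_path, run_size=100_000):
    """Sort a text file of one integer per line using bounded memory.

    Illustrative baseline only (hypothetical helper, not the paper's method).
    Returns the number of sorted runs that were written to disk.
    """
    run_paths = []
    # Phase 1: read fixed-size chunks, sort each in memory, write it as a run.
    with open(in_path) as src:
        while True:
            chunk = [int(line) for line in islice(src, run_size)]
            if not chunk:
                break
            chunk.sort()
            fd, path = tempfile.mkstemp(suffix=".run")
            with os.fdopen(fd, "w") as run:
                run.writelines(f"{x}\n" for x in chunk)
            run_paths.append(path)

    # Phase 2: k-way merge of the sorted runs; heapq.merge streams lazily,
    # so only one pending value per run is held in memory at a time.
    files = [open(p) for p in run_paths]
    try:
        streams = [(int(line) for line in f) for f in files]
        with open(out_path, "w") as dst:
            dst.writelines(f"{x}\n" for x in heapq.merge(*streams))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)
    return len(run_paths)
```

In this sketch `run_size` stands in for the available memory: a smaller value produces more runs and therefore more disk traffic during the merge, which is the kind of I/O cost the abstract identifies as dominant for very large files.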
Pages: 819-828
Number of pages: 10