GPU enhanced parallel computing for large scale data clustering

Cited by: 15
Authors
Cui, Xiaohui [1, 4]
St Charles, Jesse [3]
Potok, Thomas [2]
Affiliations
[1] Oak Ridge Natl Lab, Dept Energy, Oak Ridge, TN 37831 USA
[2] Oak Ridge Natl Lab, Oak Ridge, TN 37831 USA
[3] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[4] New York Inst Technol, New York, NY 10023 USA
Keywords
GPU; Swarm intelligence; Data clustering; CUDA
DOI
10.1016/j.future.2012.07.009
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Analyzing and clustering large-scale data sets is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. One limitation of this method of data clustering is its O(n^2) complexity. As the number of data items and feature dimensions grows, it becomes increasingly difficult to generate results in a reasonable amount of time. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly parallel and semi-parallel problems much faster than the traditional sequential processor. In this paper, we conducted research to exploit this architecture and apply its strengths to the flocking-based high-dimensional data clustering problem. Using the CUDA platform from NVIDIA, we developed a Multiple Species Data Flocking implementation to run on the NVIDIA GPU. The GPU implementation ran 30 to 60 times faster than the 3 GHz CPU implementation. © 2012 Elsevier B.V. All rights reserved.
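The quadratic cost mentioned in the abstract can be illustrated with a minimal sketch of one flocking update, assuming a simplified rule in which boids with similar feature vectors attract and dissimilar ones repel. This is not the paper's Multiple Species Data Flocking implementation: the function names, the cosine-similarity test, and the `radius`, `sim_threshold`, and `step` parameters are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def flocking_step(positions, features, radius=5.0, sim_threshold=0.5, step=0.1):
    """One all-pairs update (hypothetical simplified rule, 2D positions).

    Every boid inspects every other boid, so the number of pair checks
    is n * (n - 1), i.e. O(n^2). Returns (new_positions, pair_checks).
    """
    n = len(positions)
    new_positions = []
    pair_checks = 0
    for i in range(n):
        xi, yi = positions[i]
        dx = dy = 0.0
        for j in range(n):
            if i == j:
                continue
            pair_checks += 1
            xj, yj = positions[j]
            dist = math.hypot(xj - xi, yj - yi)
            if dist == 0.0 or dist > radius:
                continue  # coincident or outside the neighborhood
            # Similar features pull the boids together, dissimilar push apart.
            sim = cosine_similarity(features[i], features[j])
            sign = 1.0 if sim >= sim_threshold else -1.0
            dx += sign * (xj - xi) / dist
            dy += sign * (yj - yi) / dist
        new_positions.append((xi + step * dx, yi + step * dy))
    return new_positions, pair_checks


if __name__ == "__main__":
    pos = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    feat = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
    moved, checks = flocking_step(pos, feat)
    print(moved, checks)
```

Each iteration of the outer loop reads only the previous positions and writes one new position, so the iterations are independent of one another. That independence is what makes this kind of update a plausible candidate for assigning one GPU thread per boid, although the abstract does not describe how the authors mapped the work onto CUDA.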
Pages: 1736-1741
Page count: 6
Related papers
50 in total
  • [1] Regularized focusing inversion for large-scale gravity data based on GPU parallel computing
    WANG Haoran
    DING Yidan
    LI Feida
    LI Jing
    [J]. Global Geology, 2019, 22 (03) : 179 - 187
  • [2] Large-scale parallel data clustering
    Judd, D
    McKinley, PK
    Jain, AK
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1998, 20 (08) : 871 - 876
  • [3] Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing
    Takizawa, Hiroyuki
    Kobayashi, Hiroaki
    [J]. JOURNAL OF SUPERCOMPUTING, 2006, 36 (03) : 219 - 234
  • [4] Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing
    Hiroyuki Takizawa
    Hiroaki Kobayashi
    [J]. The Journal of Supercomputing, 2006, 36 : 219 - 234
  • [5] Fast Simulation of Large-Scale Floods Based on GPU Parallel Computing
    Liu, Qiang
    Qin, Yi
    Li, Guodong
    [J]. WATER, 2018, 10 (05):
  • [6] Parallel Implementation of P Systems for Data Clustering on GPU
    Jin, Jie
    Liu, Hui
    Wang, Fengjuan
    Peng, Hong
    Wang, Jun
    [J]. BIO-INSPIRED COMPUTING - THEORIES AND APPLICATIONS, BIC-TA 2015, 2015, 562 : 200 - 211
  • [7] Parallel Clustering Algorithm for Large-Scale Biological Data Sets
    Wang, Minchao
    Zhang, Wu
    Ding, Wang
    Dai, Dongbo
    Zhang, Huiran
    Xie, Hao
    Chen, Luonan
    Guo, Yike
    Xie, Jiang
    [J]. PLOS ONE, 2014, 9 (04):
  • [8] Petroleum Geoscience Big Data and GPU Parallel Computing
    Han, Fei
    Sun, Sam Z.
    [J]. 2015 1ST IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM), 2015 : 292 - 293
  • [9] A Workflow for Parallel and Distributed Computing of Large-Scale Genomic Data
    Choi, Hyun-Hwa
    Kim, Byoung-Seob
    Ahn, Shin-Young
    Bae, Seung-Jo
    [J]. 2013 8TH INTERNATIONAL CONFERENCE FOR INTERNET TECHNOLOGY AND SECURED TRANSACTIONS (ICITST), 2013, : 215 - 218
  • [10] Accelerating K-Means Clustering with Parallel Implementations and GPU computing
    Bhimani, Janki
    Leeser, Miriam
    Mi, Ningfang
    [J]. 2015 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2015,