High-performance data mining with intelligent SSD

Cited by: 2
Authors
Jo, Yong-Yeon [1 ]
Kim, Sang-Wook [1 ]
Cho, Sung-Woo [1 ]
Bae, Duck-Ho [1 ]
Oh, Hyunok [2 ]
Affiliations
[1] Hanyang Univ, Dept Comp & Software, Seoul, South Korea
[2] Hanyang Univ, Dept Informat Syst, Seoul, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Intelligent SSD; Simulator-based evaluation; Collaborative processing; Heterogeneous scheduling;
DOI
10.1007/s10586-017-0789-4
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
An intuitive way to process big data efficiently is to reduce the volume of data transferred over the storage interface to the host system. For this reason, the notion of the intelligent SSD (iSSD) was proposed to give processing power to the SSD. There is a rich literature on iSSD; however, no real implementation has yet been made available to the public, and most prior work aims to quantify the benefits of iSSD through analytical modeling. In this paper, we first develop an iSSD simulator and use it to demonstrate the potential of iSSD in data mining. Our iSSD simulator runs on top of the gem5 simulator and fully simulates all the processes of data mining algorithms running in iSSD with cycle-level accuracy. We then address how to exploit all available computing resources for efficient processing of data mining algorithms. Today, a CPU, a GPU, and an SSD are installed together in most computing environments. If the SSD is later replaced with an iSSD, we obtain a new computing environment in which the three computing resources collaborate with one another to process big data effectively. This requires scheduling to decide which computing resource runs which function at which time. Our heterogeneous scheduling considers the types of computing resources, the memory sizes of the computing resources, and inter-processor communication times, including the I/O time in the SSD. Our scheduling results show that processing in the collaborative environment outperforms processing in the traditional one by up to about 10 times.
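The abstract names three inputs to the scheduling decision: the type of each computing resource, its memory size, and the communication time between resources. The sketch below illustrates how such inputs could drive a placement decision. It is not the authors' scheduler: it assumes a simple linear pipeline of functions and uses dynamic programming to pick one resource per function. Every name (`Resource`, `Function`, `schedule_pipeline`) and every number is a hypothetical value chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    memory_mb: int  # hypothetical memory capacity


@dataclass(frozen=True)
class Function:
    name: str
    memory_mb: int   # hypothetical working-set size
    exec_time: dict  # resource name -> run time in seconds (made-up values)


def schedule_pipeline(functions, resources, transfer_time):
    """Assign each function of a linear pipeline to one resource.

    Minimizes run time plus data-movement time. transfer_time[(src, dst)]
    is the assumed cost of moving intermediate data between two resources.
    Returns (total_time, [resource name per function]).
    """
    # best[r] = cheapest (time, assignment) whose last function ran on r
    best = {None: (0.0, [])}
    for fn in functions:
        nxt = {}
        for res in resources:
            # Skip resources that lack the memory or cannot run this function
            if fn.memory_mb > res.memory_mb or res.name not in fn.exec_time:
                continue
            candidates = []
            for prev, (t, plan) in best.items():
                same_place = prev is None or prev == res.name
                move = 0.0 if same_place else transfer_time[(prev, res.name)]
                candidates.append(
                    (t + move + fn.exec_time[res.name], plan + [res.name])
                )
            nxt[res.name] = min(candidates, key=lambda c: c[0])
        if not nxt:
            raise ValueError(f"no resource can run {fn.name}")
        best = nxt
    return min(best.values(), key=lambda c: c[0])


# Hypothetical environment: all capacities and timings are invented.
resources = [Resource("CPU", 16000), Resource("GPU", 4000), Resource("iSSD", 1000)]
functions = [
    Function("scan", 500, {"CPU": 10.0, "GPU": 12.0, "iSSD": 2.0}),
    Function("count", 2000, {"CPU": 4.0, "GPU": 1.0, "iSSD": 6.0}),
    Function("aggregate", 100, {"CPU": 1.0, "GPU": 1.0, "iSSD": 3.0}),
]
pairs = {("CPU", "GPU"): 0.5, ("CPU", "iSSD"): 1.0, ("GPU", "iSSD"): 1.5}
transfer = {**pairs, **{(b, a): t for (a, b), t in pairs.items()}}

total, plan = schedule_pipeline(functions, resources, transfer)
print(total, plan)  # 5.5 ['iSSD', 'GPU', 'GPU']
```

With these invented numbers, the scan runs inside the iSSD, so only its result crosses the storage interface, while the memory-hungry `count` step is excluded from the iSSD by the capacity check. A real scheduler for general task graphs would also have to handle parallel branches and resource contention, which this sketch ignores.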
Pages: 1155-1166
Number of pages: 12
Related papers
50 in total
  • [1] High-performance data mining with intelligent SSD
    Jo, Yong-Yeon
    Kim, Sang-Wook
    Cho, Sung-Woo
    Bae, Duck-Ho
    Oh, Hyunok
    [J]. Cluster Computing, 2017, 20 : 1155 - 1166
  • [2] Intelligent SSD: A Turbo for Big Data Mining
    Bae, Duck-Ho
    Kim, Jin-Hyung
    Kim, Sang-Wook
    Oh, Hyunok
    Park, Chanik
    [J]. PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013 : 1573 - 1576
  • [3] Intelligent SSD: A turbo for big data mining
    Bae, Duck-Ho
    Kim, Jin-Hyung
    Jo, Yong-Yeon
    Kim, Sang-Wook
    Oh, Hyunok
    Park, Chanik
    [J]. COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2016, 13 (02) : 375 - 394
  • [4] High-performance data mining
    Baker, Jack
    Rollins, John
    [J]. IBM Data Management Magazine, 2009, (03):
  • [5] High-performance data mining system
    Yaginuma, Y
    [J]. FUJITSU SCIENTIFIC & TECHNICAL JOURNAL, 2000, 36 (02) : 201 - 210
  • [6] Data Mining in Intelligent SSD: Simulation-based Evaluation
    Jo, Yong-Yeon
    Chung, Moonjun
    Kim, Sang-Wook
    Oh, Hyunok
    [J]. 2016 INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2016 : 123 - 128
  • [7] HETEROGENEOUS HIGH PERFORMANCE DATA MINING SYSTEM FOR INTELLIGENT DATA
    WANG, XINKE
    LI, KAI
    LI, XIAOLING
    [J]. Scalable Computing, 2024, 25 (04) : 2636 - 2644
  • [8] A data mining toolset for distributed high-performance platforms
    Cannataro, M
    Congiusta, A
    Talia, D
    Trunfio, P
    [J]. DATA MINING III, 2002, 6 : 41 - 50
  • [9] Models and algorithms for high-performance distributed data mining
    Cuzzocrea, Alfredo
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2013, 73 (03) : 281 - 283
  • [10] Scalable, high-performance data mining with parallel processing
    Freitas, AA
    [J]. PRINCIPLES OF DATA MINING AND KNOWLEDGE DISCOVERY, 1998, 1510 : 477 - 477