High-performance data mining with intelligent SSD

Cited by: 0
Authors
Yong-Yeon Jo
Sang-Wook Kim
Sung-Woo Cho
Duck-Ho Bae
Hyunok Oh
Affiliations
[1] Hanyang University, Department of Computer and Software
[2] Hanyang University, Department of Information Systems
Source
Cluster Computing | 2017 / Vol. 20
Keywords
Intelligent SSD; Simulator-based evaluation; Collaborative processing; Heterogeneous scheduling
DOI
Not available
Abstract
An intuitive way to process big data efficiently is to reduce the volume of data transferred over the storage interface to a host system. This is why the notion of intelligent SSD (iSSD) was proposed: to give processing power to the SSD. There is a rich literature on iSSD; however, no real implementation has been made available to the public yet. Most prior work aims to quantify the benefits of iSSD with analytical modeling. In this paper, we first develop an iSSD simulator and use it to present the potential of iSSD in data mining. Our iSSD simulator runs on top of the gem5 simulator and fully simulates all the processes of data mining algorithms running in iSSD with cycle-level accuracy. Then, we further address how to exploit all the computing resources for efficient processing of data mining algorithms. These days, CPU, GPU, and SSD are equipped together in most computing environments. If SSD is later replaced with iSSD, we have a new computing environment where the three computing resources collaborate with one another to process big data effectively. For this, scheduling is required to decide which computing resource runs which function at which time. Our heterogeneous scheduling considers the types of computing resources, the memory sizes in the computing resources, and inter-processor communication times, including I/O time in SSD. Our scheduling results show that processing in the collaborative environment outperforms that in the traditional one by up to about 10 times.
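The scheduling problem described in the abstract (which resource runs which function at which time, given resource types, memory sizes, and communication times) can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify; it is a generic greedy earliest-finish-time list scheduler. The names `Resource`, `Task`, `schedule`, and `transfer_time`, as well as all numbers, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str              # e.g. "CPU", "GPU", "iSSD"
    memory_mb: int         # memory available on this resource
    free_at: float = 0.0   # time at which the resource becomes idle


@dataclass
class Task:
    name: str
    memory_mb: int         # working-set size of the function
    run_time: dict         # resource name -> estimated execution time
    deps: tuple = ()       # names of tasks that must finish first


def schedule(tasks, resources, transfer_time):
    """Greedy earliest-finish-time placement (illustrative only).

    tasks must be listed in a dependency-respecting order.
    transfer_time(src, dst) is the cost of moving a result between
    two resources. Returns {task name: (resource name, start, finish)}.
    """
    placed = {}
    for task in tasks:
        best = None
        for res in resources:
            # Skip resources that cannot hold the task or cannot run it.
            if task.memory_mb > res.memory_mb or res.name not in task.run_time:
                continue
            # Inputs are ready once every dependency has finished and
            # its result has been moved to this resource.
            ready = 0.0
            for dep in task.deps:
                dep_res, _, dep_finish = placed[dep]
                ready = max(ready, dep_finish + transfer_time(dep_res, res.name))
            start = max(ready, res.free_at)
            finish = start + task.run_time[res.name]
            if best is None or finish < best[2]:
                best = (res, start, finish)
        if best is None:
            raise ValueError(f"no resource can run {task.name}")
        res, start, finish = best
        res.free_at = finish
        placed[task.name] = (res.name, start, finish)
    return placed


# Hypothetical example: made-up costs, not measurements from the paper.
def transfer_time(src, dst):
    return 0.0 if src == dst else 2.0


resources = [Resource("CPU", 8192), Resource("GPU", 4096), Resource("iSSD", 512)]
tasks = [
    Task("scan", 256, {"CPU": 10.0, "iSSD": 4.0}),
    Task("count", 1024, {"CPU": 6.0, "GPU": 2.0}, deps=("scan",)),
]
plan = schedule(tasks, resources, transfer_time)
```

With these made-up costs, the data-heavy `scan` step is placed on the iSSD and the compute-heavy `count` step on the GPU, since the transfer cost is smaller than the time saved. A greedy policy like this is simple but not optimal in general.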
Pages: 1155-1166
Page count: 11
Related papers
50 in total
  • [1] High-performance data mining with intelligent SSD
    Jo, Yong-Yeon
    Kim, Sang-Wook
    Cho, Sung-Woo
    Bae, Duck-Ho
    Oh, Hyunok
    [J]. CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2017, 20 (02): 1155-1166
  • [2] Intelligent SSD: A Turbo for Big Data Mining
    Bae, Duck-Ho
    Kim, Jin-Hyung
    Kim, Sang-Wook
    Oh, Hyunok
    Park, Chanik
    [J]. PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013: 1573-1576
  • [3] Intelligent SSD: A turbo for big data mining
    Bae, Duck-Ho
    Kim, Jin-Hyung
    Jo, Yong-Yeon
    Kim, Sang-Wook
    Oh, Hyunok
    Park, Chanik
    [J]. COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2016, 13 (02): 375-394
  • [4] High-performance data mining
    Baker, Jack
    Rollins, John
    [J]. IBM Data Management Magazine, 2009, (03)
  • [5] High-performance data mining system
    Yaginuma, Y
    [J]. FUJITSU SCIENTIFIC & TECHNICAL JOURNAL, 2000, 36 (02): 201-210
  • [6] Data Mining in Intelligent SSD: Simulation-based Evaluation
    Jo, Yong-Yeon
    Chung, Moonjun
    Kim, Sang-Wook
    Oh, Hyunok
    [J]. 2016 INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2016: 123-128
  • [7] HETEROGENEOUS HIGH PERFORMANCE DATA MINING SYSTEM FOR INTELLIGENT DATA
    WANG, XINKE
    LI, KAI
    LI, XIAOLING
    [J]. Scalable Computing, 2024, 25 (04): 2636-2644
  • [8] A data mining toolset for distributed high-performance platforms
    Cannataro, M
    Congiusta, A
    Talia, D
    Trunfio, P
    [J]. DATA MINING III, 2002, 6: 41-50
  • [9] Models and algorithms for high-performance distributed data mining
    Cuzzocrea, Alfredo
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2013, 73 (03): 281-283
  • [10] Scalable, high-performance data mining with parallel processing
    Freitas, AA
    [J]. PRINCIPLES OF DATA MINING AND KNOWLEDGE DISCOVERY, 1998, 1510: 477