Distributed data mining on grids: Services, tools, and applications

Cited by: 65
Authors
Cannataro, M [1]
Congiusta, A
Pugliese, A
Talia, D
Trunfio, P
Affiliations
[1] Univ Catanzaro, I-88100 Catanzaro, Italy
[2] Univ Calabria, DEIS, I-87036 Arcavacata Di Rende, CS, Italy
Keywords
grid computing; grid programming; grid scheduling; knowledge grid; data mining
DOI
10.1109/TSMCB.2004.836890
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data mining algorithms are widely used today for the analysis of large corporate and scientific datasets stored in databases and data archives. Industry, science, and commerce fields often need to analyze very large datasets maintained over geographically distributed sites by using the computational power of distributed and parallel systems. The grid can play a significant role in providing an effective computational support for distributed knowledge discovery applications. For the development of data mining applications on grids we designed a system called KNOWLEDGE GRID. This paper describes the KNOWLEDGE GRID framework and presents the toolset provided by the KNOWLEDGE GRID for implementing distributed knowledge discovery. The paper discusses how to design and implement data mining applications by using the KNOWLEDGE GRID tools starting from searching grid resources, composing software and data components, and executing the resulting data mining process on a grid. Some performance results are also discussed.
Pages: 2451-2465
Number of pages: 15
Related papers
50 items in total
  • [21] Task scheduling in Distributed Data Mining for medical applications
    Gantenbein, RE
    Sung, CO
    [J]. COMPUTER APPLICATIONS IN INDUSTRY AND ENGINEERING, 2003, : 250 - 253
  • [22] Lightweight clustering technique for distributed data mining applications
    Aouad, Lamine M.
    Le-Khac, Nhien-An
    Kechadi, Tahar M.
    [J]. ADVANCES IN DATA MINING: THEORETICAL ASPECTS AND APPLICATIONS, PROCEEDINGS, 2007, 4597 : 120 - +
  • [23] Shared state for distributed interactive data mining applications
    Parthasarathy, S
    Dwarkadas, S
    [J]. DISTRIBUTED AND PARALLEL DATABASES, 2002, 11 (02) : 129 - 155
  • [24] A Failure Handling Framework for Distributed Data Mining Services on the Grid
    Cesario, Eugenio
    Talia, Domenico
    [J]. PROCEEDINGS OF THE 19TH INTERNATIONAL EUROMICRO CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING, 2011, : 70 - 79
  • [25] High Efficient Scheduler for Distributed Data Mining Applications
    Liu, Meiqun
    Gao, Kun
    [J]. CEA'09: PROCEEDINGS OF THE 3RD WSEAS INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND APPLICATIONS, 2009, : 87 - +
  • [26] Toward autonomic distributed data mining with intelligent web services
    Kantardzic, M
    Kumar, A
    [J]. IKE'03: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE ENGINEERING, VOLS 1 AND 2, 2003, : 544 - 550
  • [27] Data mining with sparse grids
    Garcke, J
    Griebel, M
    Thess, M
    [J]. COMPUTING, 2001, 67 (03) : 225 - 253
  • [28] Data Mining with Sparse Grids
    J. Garcke
    M. Griebel
    M. Thess
    [J]. Computing, 2001, 67 : 225 - 253
  • [29] Data Mining: Applications, tools, learning types and other subtopics
    Carvalho, Deborah Ribeiro
    Dallagassa, Marcelo Rosano
    [J]. ATOZ-NOVAS PRATICAS EM INFORMACAO E CONHECIMENTO, 2014, 3 (02) : 82 - 86
  • [30] Integrating and mining distributed environmental archives on Grids
    Zhizhin, M.
    Kihn, E.
    Redmon, R.
    Poyda, A.
    Mishin, D.
    Medvedev, D.
    Lyutsarev, V.
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2007, 19 (16) : 2157 - 2170