Programming knowledge discovery workflows in service-oriented distributed systems

Cited by: 8
Authors
Cesario, Eugenio [1]
Lackovic, Marco [2]
Talia, Domenico [1, 2]
Trunfio, Paolo [2]
Affiliations
[1] ICAR CNR, Arcavacata Di Rende, CS, Italy
[2] Univ Calabria, DEIS, I-87036 Arcavacata Di Rende, CS, Italy
Keywords
distributed data mining; workflows; Grid computing; Knowledge Grid
DOI
10.1002/cpe.2936
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Many scientific and business domains generate very large data repositories. Finding interesting and useful information in those repositories requires efficient data mining techniques and knowledge discovery processes. In science, data mining helps researchers form hypotheses and supports their scientific practice; in industry, it turns existing data sources into real value for companies that can take advantage of the knowledge extracted from them. Data mining tasks are often composed of multiple stages that may be linked to each other to form various execution flows. Moreover, data mining tasks are often distributed because they involve data and tools located across geographically distributed environments. It is therefore fundamental to exploit effective paradigms, such as services and workflows, to model data mining tasks that are both multi-staged and distributed. This paper discusses data mining services and workflows for analyzing scientific data in high-performance distributed environments such as Grids and Clouds. We discuss how basic and complex services can be defined to support distributed data mining tasks in Grids. We also present a workflow formalism and a service-oriented programming framework, named DIS3GNO, for designing and running distributed knowledge discovery processes in the Knowledge Grid system. DIS3GNO supports all the phases of a knowledge discovery process, including composition, execution, and results visualization. After introducing DIS3GNO, we discuss some relevant use cases implemented with it and a performance evaluation of the system. Copyright (C) 2012 John Wiley & Sons, Ltd.
Pages: 1482-1504
Number of pages: 23