Orange4WS Environment for Service-Oriented Data Mining

Cited by: 22
Authors
Podpecan, Vid [1]
Zemenova, Monika [2]
Lavrac, Nada [1]
Affiliations
[1] Jozef Stefan Inst, Ljubljana, Slovenia
[2] IZIP Inc, Prague, Czech Republic
Source
COMPUTER JOURNAL | 2012 / Vol. 55 / No. 1
Keywords
data mining; knowledge discovery; knowledge discovery ontology; e-science workflows; automated planning of data mining workflows; SUBGROUP DISCOVERY; SYSTEM
DOI
10.1093/comjnl/bxr077
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Novel data-mining tasks in e-science involve mining of distributed, highly heterogeneous data and knowledge sources. However, standard data mining platforms, such as Weka and Orange, involve only their own data mining algorithms in the process of knowledge discovery from local data sources. In contrast, next generation data mining technologies should enable processing of distributed data sources, the use of data mining algorithms implemented as web services, as well as the use of formal descriptions of data sources and knowledge discovery tools in the form of ontologies, enabling automated composition of complex knowledge discovery workflows for a given data mining task. This paper proposes a novel Service-oriented Knowledge Discovery framework and its implementation in a service-oriented data mining environment Orange4WS (Orange for Web Services), based on the existing Orange data mining toolbox and its visual programming environment, which enables manual composition of data mining workflows. The new service-oriented data mining environment Orange4WS includes the following new features: simple use of web services as remote components that can be included into a data mining workflow; simple incorporation of relational data mining algorithms; a knowledge discovery ontology to describe workflow components (data, knowledge and data mining services) in an abstract and machine-interpretable way, and its use by a planner that enables automated composition of data mining workflows. These new features are showcased in three real-world scenarios.
Pages: 82-98
Number of pages: 17
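
The abstract describes a planner that uses machine-interpretable descriptions of workflow components to compose data mining workflows automatically. The idea can be illustrated with a minimal sketch, which is not the Orange4WS planner or its ontology: each component is described only by the abstract data types it consumes and produces, and a breadth-first search chains components until the requested type is available. All component names and type labels below (`load_table`, `Dataset`, `RuleSet`, and so on) are hypothetical and chosen for illustration only.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """A workflow step described only by the types it consumes and produces."""
    name: str
    inputs: frozenset
    outputs: frozenset


def plan(components, available, goal):
    """Breadth-first search for a shortest component sequence that yields `goal`.

    Returns a list of component names, or None if no sequence exists.
    """
    start = frozenset(available)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal in state:
            return steps
        for c in components:
            # A component is applicable when all of its inputs are available
            # and it would contribute at least one new type.
            if c.inputs <= state and not c.outputs <= state:
                nxt = state | c.outputs
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [c.name]))
    return None


# Hypothetical component descriptions (assumptions, not taken from the paper).
COMPONENTS = [
    Component("load_table", frozenset({"FileLocation"}), frozenset({"Dataset"})),
    Component("discretize", frozenset({"Dataset"}), frozenset({"DiscreteDataset"})),
    Component("subgroup_discovery_service", frozenset({"DiscreteDataset"}), frozenset({"RuleSet"})),
    Component("rule_visualizer", frozenset({"RuleSet"}), frozenset({"Report"})),
]

workflow = plan(COMPONENTS, {"FileLocation"}, "Report")
print(workflow)
# ['load_table', 'discretize', 'subgroup_discovery_service', 'rule_visualizer']
```

In this sketch a remote web service and a local algorithm look identical to the planner, since both are just typed components; a real system would draw the type labels from an ontology rather than from plain strings.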