Distributed generative data mining

Cited by: 0
Authors
Ramos, Ruy [1]
Camacho, Rui [2]
Affiliations
[1] LIACC, Rua Ceuta 118-6, P-4050190 Oporto, Portugal
[2] EEUP, P-4200465 Oporto, Portugal
Keywords
data mining; parallel and distributed computing; inductive logic programming
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A Knowledge Discovery in Databases (KDD) process involving large amounts of data requires considerable computational power. The process may be run on dedicated, expensive machinery or, for some tasks, with distributed computing techniques on a network of affordable machines. In either approach the user usually specifies the workflow of the sub-tasks composing the whole KDD process before execution starts. In this paper we propose a technique that we call Distributed Generative Data Mining. The technique is generative in that it can create new sub-tasks of the Data Mining (DM) analysis process at execution time. The workflow of DM sub-tasks is, therefore, dynamic. To deploy the proposed technique we extended the Distributed Data Mining system HARVARD and adapted an Inductive Logic Programming system (IndLog) used in a Relational Data Mining task. As a proof of concept, the extended system was used to analyse an artificial dataset of a credit scoring problem with eighty million records.
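The idea of a workflow whose sub-tasks are generated at execution time can be illustrated with a minimal, single-process sketch. It is not the HARVARD or IndLog implementation: the names `run_dynamic_workflow`, `split_until_depth_two`, and the `(name, depth)` task shape are hypothetical, chosen only to show a queue of tasks that grows while it is being processed.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Tuple

# Hypothetical task representation: (name, depth). Not taken from the paper.
Task = Tuple[str, int]


def run_dynamic_workflow(
    initial: Iterable[Task],
    expand: Callable[[Task], List[Task]],
) -> List[Task]:
    """Process tasks in FIFO order; each finished task may emit new sub-tasks."""
    pending: Deque[Task] = deque(initial)
    completed: List[Task] = []
    while pending:
        task = pending.popleft()
        completed.append(task)
        # The workflow is not fixed up front: new sub-tasks are appended
        # to the queue while execution is already under way.
        pending.extend(expand(task))
    return completed


def split_until_depth_two(task: Task) -> List[Task]:
    """Assumed expansion rule: split every task in two until depth 2."""
    name, depth = task
    if depth >= 2:
        return []
    return [(f"{name}.{i}", depth + 1) for i in range(2)]


order = run_dynamic_workflow([("root", 0)], split_until_depth_two)
```

In a distributed setting the queue would be shared among worker machines rather than drained by one loop; this sketch keeps only the part where the set of tasks is unknown before execution starts.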
Pages: 307 / +
Page count: 3