Knowledge discovery by probabilistic clustering of distributed databases

Cited by: 12
Authors
McClean, S [1]
Scotney, B [1]
Morrow, P [1]
Greer, K [1]
Affiliations
[1] Univ Ulster, Sch Comp & Informat Engn, Coleraine BT52 1SA, Londonderry, North Ireland
Keywords
distributed databases; probabilistic clustering; aggregates; dynamic shared ontology
DOI
10.1016/j.datak.2004.12.001
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Clustering of distributed databases facilitates knowledge discovery through learning of new concepts that characterise common features and differences between datasets. Hence, general patterns can be learned rather than restricting learning to specific databases from which rules may not be generalisable. We cluster databases that hold aggregate count data on categorical attributes that have been classified according to homogeneous or heterogeneous classification schemes. Clustering of datasets is carried out via the probability distributions that describe their respective aggregates. The homogeneous case is straightforward. For heterogeneous data we investigate a number of clustering strategies, of which the most efficient avoid the need to compute a dynamic shared ontology to homogenise the classification schemes prior to clustering. © 2004 Elsevier B.V. All rights reserved.
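As a rough illustration of the homogeneous case mentioned in the abstract, the sketch below clusters datasets that report aggregate counts over the same categories by comparing their empirical probability distributions. It is not taken from the paper: the Jensen-Shannon divergence, the greedy pairwise merging, the threshold value, and all names (`to_distribution`, `js_divergence`, `cluster_datasets`) are assumptions chosen only to make the idea concrete.

```python
import math


def to_distribution(counts):
    """Turn a vector of aggregate counts into an empirical distribution.

    Assumes the counts are non-negative and sum to a positive total.
    """
    total = sum(counts)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the sum.
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def cluster_datasets(datasets, threshold):
    """Greedy agglomerative clustering over count vectors (illustrative only).

    Repeatedly merges the two clusters whose distributions are closest,
    pooling their counts, until the closest pair exceeds `threshold`.
    """
    clusters = [([name], list(counts)) for name, counts in datasets.items()]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = js_divergence(
                    to_distribution(clusters[i][1]),
                    to_distribution(clusters[j][1]),
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        names = clusters[i][0] + clusters[j][0]
        pooled = [a + b for a, b in zip(clusters[i][1], clusters[j][1])]
        clusters[i] = (names, pooled)
        del clusters[j]
    return [sorted(names) for names, _ in clusters]


# Hypothetical aggregate counts over three shared categories.
example = {"A": [50, 30, 20], "B": [48, 32, 20], "C": [10, 20, 70]}
result = cluster_datasets(example, threshold=0.01)
```

With these made-up counts, datasets A and B have nearly identical distributions and end up in one cluster, while C stays separate. The heterogeneous case described in the abstract, where classification schemes differ between databases, is not covered by this sketch.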
Pages: 189-210
Number of pages: 22
Related papers
50 in total
  • [1] Clustering classifiers for knowledge discovery from physically distributed databases
    Tsoumakas, G
    Angelis, L
    Vlahavas, I
    [J]. DATA & KNOWLEDGE ENGINEERING, 2004, 49 (03) : 223 - 242
  • [2] Probabilistic ant based clustering for distributed databases
    Chandrasekar, R.
    Vijaykumar, Vivek
    Srinivasan, T.
    [J]. 2006 3RD INTERNATIONAL IEEE CONFERENCE INTELLIGENT SYSTEMS, VOLS 1 AND 2, 2006, : 529 - 536
  • [3] ADVANCES IN KNOWLEDGE DISCOVERY IN DISTRIBUTED DATABASES
    Pupezescu, Valentin
    [J]. RETHINKING EDUCATION BY LEVERAGING THE ELEARNING PILLAR OF THE DIGITAL AGENDA FOR EUROPE!, VOL. I, 2015, : 311 - 319
  • [4] An Improved Probabilistic Ant based Clustering for Distributed Databases
    Chandrasekar, R.
    Srinivasan, T.
    [J]. 20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007, : 2701 - 2706
  • [5] Study of fuzzy clustering algorithm to knowledge discovery in databases
    Xie, Yinbao
    [J]. 2002, Shanghai Computer Society (28):
  • [6] Parallel and distributed databases, data mining and knowledge discovery
    Talia, D
    Kargupta, H
    Valduriez, P
    Camacho, R
    [J]. EURO-PAR 2005 PARALLEL PROCESSING, PROCEEDINGS, 2005, 3648 : 347 - 347
  • [7] Parallel and distributed databases, data mining and knowledge discovery
    Skillicorn, D
    Hameurlain, A
    Watson, P
    Orlando, S
    [J]. EURO-PAR 2004 PARALLEL PROCESSING, PROCEEDINGS, 2004, 3149 : 346 - 346
  • [8] Parallel and distributed databases, data mining and knowledge discovery
    Kosch, H
    Skillicorn, D
    Talia, D
    [J]. EURO-PAR 2002 PARALLEL PROCESSING, PROCEEDINGS, 2002, 2400 : 319 - 320
  • [9] Knowledge discovery in distributed databases using evidence theory
    Cai, D
    McTear, MF
    McClean, SI
    [J]. INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2000, 15 (08) : 745 - 761
  • [10] Optimization for Distributed Committee Machines in The Knowledge Discovery in Distributed Databases Process
    Pupezescu, Valentin
    [J]. Proceedings of the 10th International Conference on Virtual Learning, 2015, : 247 - 253