Models of distributed data clustering in peer-to-peer environments

Cited by: 0
Authors
Khaled M. Hammouda
Mohamed S. Kamel
Affiliations
[1] Desire2Learn Inc., Department of Electrical and Computer Engineering, PAMI Group
[2] University of Waterloo
Keywords
Distributed data clustering; Peer-to-peer data mining
DOI
Not available
Abstract
Distributed data mining mines distributed data sources without first collecting the data at a central site. This is particularly appealing when communication cost and privacy concerns restrict traditional centralized methods. Although distributed data mining has advanced on many fronts, models that abstract the process and expose the similarities and contrasts between the different methods are still lacking. In this paper, we introduce two abstract models for distributed clustering in peer-to-peer environments, each with a different goal. The first is the Locally optimized Distributed Clustering (LDC) model, which aims to achieve better local clusters at each node through collaboration, by sharing summarized cluster information. The second is the Globally optimized Distributed Clustering (GDC) model, which aims to achieve one global clustering solution that approximates centralized clustering. We also report on concrete realizations of the two models that show their benefits in a text mining application. The LDC model is realized through the Collaborative P2P Clustering algorithm, and the GDC model through the Hierarchically distributed P2P Clustering algorithm. In the former, peers exchange information through cluster summarization, and we show that this collaboration results in a significant increase in local clustering quality. In the latter, we target scalability by structuring the P2P network hierarchically and devise a distributed variant of the k-means algorithm to compute one set of clusters across the hierarchy. We demonstrate through experimental results the effectiveness of both methods and make recommendations on when to use each.
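The idea of computing one set of k-means clusters without moving raw data to a central site can be illustrated with a minimal sketch. This is not the paper's Hierarchically distributed P2P Clustering algorithm, whose details the abstract does not give. It is a generic scheme in which each peer sends only per-cluster coordinate sums and counts, and a parent node adds them up. The function names (`local_summary`, `merge`, `distributed_kmeans`) and the in-memory list of peers are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
# A summary holds per-cluster coordinate sums and per-cluster point counts.
Summary = Tuple[List[List[float]], List[int]]


def nearest(point: Point, centroids: List[List[float]]) -> int:
    """Return the index of the centroid closest to point (squared Euclidean)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, centroid))
             for centroid in centroids]
    return dists.index(min(dists))


def local_summary(points: List[Point], centroids: List[List[float]]) -> Summary:
    """Run at one peer: assign local points and return only sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in points:
        k = nearest(point, centroids)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += point[d]
    return sums, counts


def merge(summaries: List[Summary]) -> Summary:
    """Run at a parent node: add up the summaries received from children."""
    k = len(summaries[0][1])
    dim = len(summaries[0][0][0])
    total_sums = [[0.0] * dim for _ in range(k)]
    total_counts = [0] * k
    for sums, counts in summaries:
        for j in range(k):
            total_counts[j] += counts[j]
            for d in range(dim):
                total_sums[j][d] += sums[j][d]
    return total_sums, total_counts


def distributed_kmeans(peers: List[List[Point]],
                       centroids: List[List[float]],
                       iterations: int = 10) -> List[List[float]]:
    """Iterate: peers summarize locally, summaries are merged, centroids updated."""
    for _ in range(iterations):
        sums, counts = merge([local_summary(p, centroids) for p in peers])
        # An empty cluster keeps its previous centroid.
        centroids = [
            [s / n for s in sums_k] if n else old
            for sums_k, n, old in zip(sums, counts, centroids)
        ]
    return centroids
```

Because `merge` is plain addition, it can be applied level by level up a hierarchy of peers and still yield the same centroids as one centralized k-means update over all the data. For example, `distributed_kmeans([[(0.0, 0.0), (10.0, 10.0)], [(0.0, 2.0), (10.0, 12.0)]], [[0.0, 0.0], [10.0, 10.0]])` returns `[[0.0, 1.0], [10.0, 11.0]]`, even though each cluster is split across both peers.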
Pages: 303-329
Page count: 26