A unified probabilistic framework for clustering correlated heterogeneous web objects

Cited by: 0
Authors
Liu, GW [1 ]
Zhu, WB [1 ]
Yu, Y [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200030, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Most existing algorithms cluster highly correlated data objects (e.g., web pages and web queries) separately. Other algorithms do take the relationships between data objects into account, but they either merge content and link features into a single feature space or apply a hard clustering algorithm, which makes it difficult to fully exploit the correlated information across heterogeneous Web objects. In this paper, we propose a novel unified probabilistic framework that iteratively clusters correlated heterogeneous data objects until convergence. Our approach introduces two latent clustering layers, which serve as two probabilistic mixture models of the features. In each clustering iteration, we use the EM (Expectation-Maximization) algorithm to estimate the parameters of the mixture model in one latent layer and propagate them to the other layer. The experimental results show that our approach effectively combines content and link features and improves clustering performance.
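The abstract does not spell out the model, so the following is only a minimal sketch of the general technique it names: EM fitted to a mixture model, producing soft (probabilistic) cluster memberships rather than hard assignments. It covers a single latent layer and omits the propagation step between the two layers, which the abstract does not describe. The choice of a multinomial mixture, the function name `em_multinomial_mixture`, the smoothing constant, and the toy count vectors are all assumptions made for illustration, not the authors' method.

```python
import math
import random


def em_multinomial_mixture(counts, k, iters=50, seed=0, smooth=1e-2):
    """Soft-cluster count vectors with EM on a mixture of multinomials.

    Hypothetical helper for illustration only.
    Returns (priors, feat_probs, resp), where resp[i][c] approximates
    P(cluster c | object i).
    """
    rng = random.Random(seed)
    n, d = len(counts), len(counts[0])

    # Start from random soft assignments; each row is normalized to sum to 1.
    resp = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(row)
        resp.append([r / total for r in row])

    priors, feat_probs = [], []
    for _ in range(iters):
        # M-step: re-estimate cluster priors (floored to avoid log(0))
        # and smoothed per-cluster feature distributions.
        priors = [max(sum(resp[i][c] for i in range(n)) / n, 1e-12)
                  for c in range(k)]
        feat_probs = []
        for c in range(k):
            weighted = [smooth + sum(resp[i][c] * counts[i][j] for i in range(n))
                        for j in range(d)]
            total = sum(weighted)
            feat_probs.append([w / total for w in weighted])

        # E-step: posterior cluster memberships, computed in log space
        # and normalized with the max-subtraction trick for stability.
        for i in range(n):
            logs = [math.log(priors[c])
                    + sum(counts[i][j] * math.log(feat_probs[c][j])
                          for j in range(d))
                    for c in range(k)]
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            resp[i] = [u / total for u in unnorm]
    return priors, feat_probs, resp


# Toy feature counts (assumed data): objects 0-1 mostly use features 0-1,
# objects 2-3 mostly use features 2-3.
counts = [
    [5, 4, 0, 0],
    [4, 6, 0, 1],
    [0, 0, 5, 5],
    [1, 0, 6, 4],
]
priors, feat_probs, resp = em_multinomial_mixture(counts, k=2)
for i, row in enumerate(resp):
    print(i, [round(p, 3) for p in row])
```

Each printed row is a probability distribution over clusters for one object, which is what distinguishes this kind of soft clustering from the hard clustering the abstract contrasts it with.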
Pages: 76-87
Page count: 12
Related papers
50 in total
  • [1] A unified framework for clustering heterogeneous web objects
    Zeng, HJ
    Chen, Z
    Ma, WY
    WISE 2002: PROCEEDINGS OF THE THIRD INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS ENGINEERING, 2002, : 161 - 170
  • [2] A unified probabilistic framework for web page scoring systems
    Diligenti, M
    Gori, M
    Maggini, M
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2004, 16 (01) : 4 - 16
  • [3] From Penalized Maximum Likelihood to cluster analysis: A unified probabilistic framework of clustering
    Sun, Xichen
    Cheng, Qiansheng
    Feng, Jufu
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2007, 21 (03) : 483 - 490
  • [4] A unified framework for heterogeneous patterns
    Catania, Barbara
    Maddalena, Anna
    INFORMATION SYSTEMS, 2012, 37 (05) : 460 - 483
  • [5] A Probabilistic Framework for Relational Clustering
    Long, Bo
    Zhang, Zhongfei
    Yu, Philip S.
    KDD-2007 PROCEEDINGS OF THE THIRTEENTH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2007, : 470 - 479
  • [6] A probabilistic framework for graph clustering
    Luo, B
    Robles-Kelly, A
    Torsello, A
    Wilson, RC
    Hancock, ER
    2001 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2001, : 912 - 919
  • [7] A Unified Framework for Approximating and Clustering Data
    Feldman, Dan
    Langberg, Michael
    STOC 11: PROCEEDINGS OF THE 43RD ACM SYMPOSIUM ON THEORY OF COMPUTING, 2011, : 569 - 578
  • [8] A unified framework for approximating and clustering data
    California Institute of Technology, Pasadena, CA 91125, United States
    Unknown
    Proc. Annu. ACM Symp. Theory Comput., (569-578):
  • [9] UNIFIED PROBABILISTIC FRAMEWORK FOR SIMULTANEOUS DETECTION AND TRACKING OF MULTIPLE OBJECTS WITH APPLICATION TO BIO-IMAGE SEQUENCES
    Karthikeyan, S.
    Delibaltov, Diana
    Gaur, Utkarsh
    Jiang, Mei
    Williams, David
    Manjunath, B. S.
    2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012), 2012, : 1349 - 1352
  • [10] A generalized Bayes framework for probabilistic clustering
    Rigon, Tommaso
    Herring, Amy H.
    Dunson, David B.
    BIOMETRIKA, 2023, 110 (03) : 559 - 578