A unified probabilistic framework for clustering correlated heterogeneous web objects

Cited by: 0
Authors
Liu, GW [1 ]
Zhu, WB [1 ]
Yu, Y [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200030, Peoples R China
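Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code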
0812 ;
Abstract
Most existing algorithms cluster highly correlated data objects (e.g., web pages and web queries) separately. Other algorithms do take the relationships between data objects into account, but they either integrate content and link features into a single feature space or apply a hard clustering algorithm, which makes it difficult to fully exploit the correlated information across heterogeneous Web objects. In this paper, we propose a novel unified probabilistic framework that iteratively clusters correlated heterogeneous data objects until convergence. Our approach introduces two latent clustering layers, which serve as two mixture probabilistic models of the features. In each clustering iteration, we use the EM (Expectation-Maximization) algorithm to estimate the parameters of the mixture model in one latent layer and propagate them to the other. The experimental results show that our approach effectively combines content and link features and improves clustering performance.
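As an illustration of the EM step the abstract mentions, here is a minimal sketch of soft clustering with a single mixture-of-multinomials model. It is not the authors' two-layer framework: the propagation between latent layers is not reproduced, and the function name `em_multinomial_mixture`, the smoothing constant, the deterministic initialisation, and the toy count matrix are all assumptions made for this example.

```python
import math


def em_multinomial_mixture(counts, k, iters=50, smooth=1e-2):
    """Soft-cluster the rows of a count matrix with a mixture of multinomials.

    counts: list of equal-length lists of non-negative feature counts.
    Returns (priors, feature_probs, responsibilities). Requires iters >= 1.
    """
    n, d = len(counts), len(counts[0])

    # Deterministic soft initialisation (an assumption, chosen for
    # reproducibility): object i leans slightly towards cluster i % k.
    resp = []
    for i in range(n):
        raw = [2.0 if c == i % k else 1.0 for c in range(k)]
        total = sum(raw)
        resp.append([r / total for r in raw])

    for _ in range(iters):
        # M-step: re-estimate smoothed cluster priors and per-cluster
        # feature distributions from the current soft assignments.
        priors = [
            (smooth + sum(resp[i][c] for i in range(n))) / (n + k * smooth)
            for c in range(k)
        ]
        probs = []
        for c in range(k):
            weighted = [
                smooth + sum(resp[i][c] * counts[i][j] for i in range(n))
                for j in range(d)
            ]
            total = sum(weighted)
            probs.append([w / total for w in weighted])

        # E-step: posterior P(cluster | object), computed in log space
        # to avoid underflow.
        for i in range(n):
            logs = [
                math.log(priors[c])
                + sum(counts[i][j] * math.log(probs[c][j]) for j in range(d))
                for c in range(k)
            ]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp[i] = [e / total for e in exps]

    return priors, probs, resp


# Toy data (hypothetical): rows are objects, columns are feature counts.
toy_counts = [
    [5, 4, 0, 0], [4, 6, 0, 0], [6, 5, 1, 0],
    [0, 0, 5, 6], [0, 1, 4, 5], [0, 0, 6, 4],
]
_, _, toy_resp = em_multinomial_mixture(toy_counts, k=2)
toy_labels = [max(range(2), key=lambda c: row[c]) for row in toy_resp]
print(toy_labels)
```

Unlike a hard clustering algorithm, the responsibilities returned here keep a probability for every cluster, which is the kind of soft assignment a probabilistic framework can pass on to another model.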
Pages: 76 - 87
Number of pages: 12
Related papers
50 in total
  • [31] A Unified Framework for Document Clustering with Dual Supervision
    Hu, Yeming
    Milios, Evangelos E.
    Blustein, James
    APPLIED COMPUTING REVIEW, 2012, 12 (02): : 53 - 63
  • [32] A Unified Framework for Privacy Preserving Data Clustering
    Li, Wenye
    NEURAL INFORMATION PROCESSING (ICONIP 2014), PT I, 2014, 8834 : 319 - 326
  • [33] A Unified Probabilistic Approach Modeling Relationships between Attributes and Objects
    Wang, Xiaoyang
    Ji, Qiang
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, : 2120 - 2127
  • [34] Clustering and portfolio selection problems: A unified framework
    Puerto, Justo
    Rodriguez-Madrena, Moises
    Scozzari, Andrea
    COMPUTERS & OPERATIONS RESEARCH, 2020, 117 (117)
  • [35] A unified framework for privacy preserving data clustering
    Li, Wenye
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2014, 8834 : 319 - 326
  • [36] A unified view of Probabilistic PCA and regularized linear fuzzy clustering
    Mori, Y
    Honda, K
    Kanda, A
    Ichihashi, H
    PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2003, VOLS 1-4, 2003, : 541 - 546
  • [37] A framework for dynamic topic clustering on the web
    Dichev, C
    Dicheva, D
    Radenski, A
    IC'2001: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INTERNET COMPUTING, VOLS I AND II, 2001, : 983 - 989
  • [38] A Framework of Rough Clustering for Web Transactions
    Yanto, Iwan Tri Riyadi
    Herawan, Tutut
    Deris, Mustafa Mat
    ADVANCES IN INTELLIGENT INFORMATION AND DATABASE SYSTEMS, 2010, 283 : 265 - 277
  • [39] Document Clustering and Topic Modeling: A Unified Bayesian Probabilistic Perspective
    Costa, Gianni
    Ortale, Riccardo
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 278 - 285
  • [40] A unified probabilistic framework for facial activity modeling and understanding
    Tong, Yan
    Liao, Wenhui
    Xue, Zheng
    Ji, Qiang
    2007 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-8, 2007, : 2363 - +