Regularized bi-directional co-clustering

Cited by: 0
Authors
Séverine Affeldt
Lazhar Labiod
Mohamed Nadif
Affiliations
[1] Université de Paris
[2] CNRS
[3] Centre Borelli
Source
Statistics and Computing | 2021 / Vol. 31
Keywords
Co-clustering; Regularization; Information retrieval; Text mining
DOI
Not available
Abstract
The simultaneous clustering of documents and words, known as co-clustering, has proved to be more effective than one-sided clustering in dealing with sparse high-dimensional datasets. By their nature, text data are also generally unbalanced and directional. Recently, the von Mises–Fisher (vMF) mixture model was proposed to handle unbalanced data while harnessing the directional nature of text. In this paper, we propose a general co-clustering framework based on a matrix formulation of vMF model-based co-clustering. This formulation leads to a flexible framework for text co-clustering that can easily incorporate both word–word semantic relationships and document–document similarities. By contrast with existing methods, which generally use an additive incorporation of similarities, we propose a bi-directional multiplicative regularization that better encapsulates the underlying text data structure. Extensive evaluations on various real-world text datasets demonstrate the superior performance of our proposed approach over baseline and competitive methods, both in terms of clustering results and co-cluster topic coherence.
Related papers
  • [1] Regularized bi-directional co-clustering
    Affeldt, Severine
    Labiod, Lazhar
    Nadif, Mohamed
    [J]. STATISTICS AND COMPUTING, 2021, 31 (03)
  • [2] Directional co-clustering
    Aghiles Salah
    Mohamed Nadif
    [J]. Advances in Data Analysis and Classification, 2019, 13 : 591 - 620
  • [3] Directional co-clustering
    Salah, Aghiles
    Nadif, Mohamed
    [J]. ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2019, 13 (03) : 591 - 620
  • [4] Tucker-Regularized Tensor Bregman Co-clustering
    Forero, Pedro A.
    Baxley, Paul A.
    [J]. 28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 1497 - 1501
  • [5] SPLITTING METHODS FOR CONVEX BI-CLUSTERING AND CO-CLUSTERING
    Weylandt, Michael
    [J]. 2019 IEEE DATA SCIENCE WORKSHOP (DSW), 2019, : 237 - 242
  • [6] Regularized Dual-PPMI Co-clustering for Text Data
    Affeldt, Severine
    Labiod, Lazhar
    Nadif, Mohamed
    [J]. SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 2263 - 2267
  • [7] Learning a bi-directional discriminative representation for deep clustering
    Wang, Yiming
    Chang, Dongxia
    Fu, Zhiqiang
    Zhao, Yao
    [J]. PATTERN RECOGNITION, 2023, 137
  • [8] Co-clustering directed graphs to discover asymmetries and directional communities
    Rohe, Karl
    Qin, Tai
    Yu, Bin
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2016, 113 (45) : 12679 - 12684
  • [9] Generalized Co-clustering Analysis via Regularized Alternating Least Squares
    Li, Gen
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, 150
  • [10] Sparse dual graph-regularized NMF for image co-clustering
    Sun, Jing
    Wang, Zhihui
    Sun, Fuming
    Li, Haojie
    [J]. NEUROCOMPUTING, 2018, 316 : 156 - 165