Textual data summarization using the Self-Organized Co-Clustering model

Cited by: 10
Authors
Selosse, Margot [1]
Jacques, Julien [1]
Biernacki, Christophe [2, 3]
Affiliations
[1] Univ Lyon, Lyon 2 & ERIC EA3083, 5 Ave Pierre Mendes, Bron 69500, France
[2] Univ Lille, UFR Math, Cite Sci, Villeneuve d'Ascq 59655, France
[3] INRIA, 40 Av Halley, Bat A, Pk Plaza, Villeneuve d'Ascq 59650, France
Keywords
Co-clustering; Document-term matrix; Latent block model; Factorization; Matrix
DOI
10.1016/j.patcog.2020.107315
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent studies have demonstrated the value of co-clustering, a data mining technique that simultaneously produces row-clusters of observations and column-clusters of features. The present work introduces a novel co-clustering model to easily summarize textual data in a document-term format. In addition to highlighting homogeneous co-clusters, as other existing algorithms do, we also distinguish noisy co-clusters from significant co-clusters, which is particularly useful for sparse document-term matrices. Furthermore, our model proposes a structure among the significant co-clusters, thus providing improved interpretability to users. The proposed approach competes with state-of-the-art methods for document and term clustering and offers user-friendly results. The model relies on the Poisson distribution and on a constrained version of the Latent Block Model, which is a probabilistic approach for co-clustering. A Stochastic Expectation-Maximization algorithm is proposed to run the model's inference, along with a model selection criterion to choose the number of co-clusters. Both simulated and real data sets illustrate the efficiency of this model through its ability to easily identify relevant co-clusters. (C) 2020 Elsevier Ltd. All rights reserved.
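As a rough illustration of the kind of block structure the abstract describes, the sketch below simulates a small document-term matrix whose counts are Poisson-distributed with one rate per (row-cluster, column-cluster) block, then recovers the block rates from known partitions. It is a generic, unconstrained Poisson block model written for illustration only, not the authors' constrained model or their Stochastic Expectation-Maximization inference. The function names, cluster sizes, and rate values are all assumptions, as is the choice to give the last column cluster a uniformly low rate to mimic a "noisy" block.

```python
import numpy as np

def simulate_block_poisson(row_labels, col_labels, rates, rng):
    """Draw counts x[i, j] ~ Poisson(rates[row_labels[i], col_labels[j]])."""
    lam = rates[np.ix_(row_labels, col_labels)]  # per-cell Poisson rate
    return rng.poisson(lam)

def block_rate_estimates(x, row_labels, col_labels, n_row_clusters, n_col_clusters):
    """With partitions known, the Poisson MLE of each block rate is the block mean."""
    est = np.zeros((n_row_clusters, n_col_clusters))
    for k in range(n_row_clusters):
        for l in range(n_col_clusters):
            block = x[np.ix_(row_labels == k, col_labels == l)]
            est[k, l] = block.mean() if block.size else 0.0
    return est

rng = np.random.default_rng(0)
row_labels = np.repeat([0, 1], 50)      # 100 documents in 2 row clusters (assumed sizes)
col_labels = np.repeat([0, 1, 2], 40)   # 120 terms in 3 column clusters (assumed sizes)
# Illustrative rates: each row cluster has one high-rate block;
# the last column cluster is low-rate everywhere (stand-in for a noisy block).
rates = np.array([[5.0, 0.2, 0.2],
                  [0.2, 5.0, 0.2]])

x = simulate_block_poisson(row_labels, col_labels, rates, rng)
est = block_rate_estimates(x, row_labels, col_labels, 2, 3)
print(np.round(est, 2))
```

In practice the partitions are unknown and must be inferred jointly with the rates, which is the role the abstract assigns to the inference algorithm and the model selection criterion.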
Pages: 13