Doubly Nonparametric Sparse Nonnegative Matrix Factorization Based on Dependent Indian Buffet Processes

Cited by: 20
Authors
Xuan, Junyu [1, 2]
Lu, Jie [1, 2]
Zhang, Guangquan [1, 2]
Xu, Richard Yi Da [1, 2]
Luo, Xiangfeng [2]
Affiliations
[1] Univ Technol Sydney, Fac Engn & Informat Technol, Ultimo, NSW 2007, Australia
[2] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
Funding
Australian Research Council; U.S. National Science Foundation
Keywords
Co-clustering; nonnegative matrix factorization; probability graphical model; text mining; VECTOR;
DOI
10.1109/TNNLS.2017.2676817
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Sparse nonnegative matrix factorization (SNMF) aims to factorize a data matrix into two optimized nonnegative sparse factor matrices, which can benefit many tasks, such as document-word co-clustering. However, traditional SNMF typically assumes that the number of latent factors (i.e., the dimensionality of the factor matrices) is fixed, which makes it inflexible in practice. In this paper, we propose a doubly sparse nonparametric NMF framework that mitigates this issue by using dependent Indian buffet processes (dIBP). We apply a correlation function to the generation of the two stick weights associated with each column pair of the factor matrices while still maintaining their respective marginal distributions specified by the IBP. As a consequence, the two factor matrices are generated with columnwise correlation. Under this framework, two classes of correlation functions are proposed: 1) using a bivariate Beta distribution and 2) using a copula function. Compared with single IBP-based NMF, the proposed framework jointly makes the two factor matrices nonparametric and sparse, so it can be applied to broader scenarios, such as co-clustering. It is also seen to be much more flexible than Gaussian process-based and hierarchical Beta process-based dIBPs in that it allows the two corresponding binary matrix columns to have greater variation in their nonzero entries. Our experiments on synthetic data show the merits of the proposed framework compared with state-of-the-art models with respect to factorization efficiency, sparsity, and flexibility. Experiments on real-world data sets demonstrate its efficiency in document-word co-clustering tasks.
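The copula idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' construction: it assumes the standard IBP stick-breaking representation, in which each stick weight is a running product of Beta(alpha, 1) draws, and it assumes a Gaussian copula with correlation `rho` as the coupling. The function names, parameter names, and all numeric values are hypothetical. Because the inverse CDF of Beta(alpha, 1) is u^(1/alpha), each stick sequence keeps its own marginal distribution while the paired sticks become correlated.

```python
import math
import random
from statistics import NormalDist


def correlated_stick_weights(num_sticks, alpha_a, alpha_b, rho, rng):
    """Draw two stick-weight sequences coupled by a Gaussian copula.

    Illustrative assumption: each sequence marginally follows IBP
    stick-breaking, mu_k = prod_{i<=k} nu_i with nu_i ~ Beta(alpha, 1).
    """
    std_normal = NormalDist()
    sticks_a, sticks_b = [], []
    prod_a = prod_b = 1.0
    for _ in range(num_sticks):
        # Correlated standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Map to correlated uniforms through the normal CDF (the copula step).
        u1 = std_normal.cdf(z1)
        u2 = std_normal.cdf(z2)
        # Inverse CDF of Beta(alpha, 1) is u ** (1 / alpha), so the
        # marginals of the two break proportions are unchanged.
        prod_a *= u1 ** (1.0 / alpha_a)
        prod_b *= u2 ** (1.0 / alpha_b)
        sticks_a.append(prod_a)
        sticks_b.append(prod_b)
    return sticks_a, sticks_b


def sample_binary_matrix(num_rows, sticks, rng):
    """Sample a binary matrix whose column k is Bernoulli(sticks[k])."""
    return [[1 if rng.random() < mu else 0 for mu in sticks]
            for _ in range(num_rows)]


rng = random.Random(0)
sticks_a, sticks_b = correlated_stick_weights(8, 2.0, 3.0, 0.9, rng)
Z_a = sample_binary_matrix(5, sticks_a, rng)  # e.g. document-side indicators
Z_b = sample_binary_matrix(4, sticks_b, rng)  # e.g. word-side indicators
```

In this sketch the two binary matrices share the same column index, so a column that is likely to be active on one side tends to be active on the other, which is the columnwise correlation the abstract describes. Setting `rho` to 0 makes the two sequences independent.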
Pages: 1835-1849
Page count: 15
Related papers
50 in total
  • [1] Indian Buffet Process-Based on Nonnegative Matrix Factorization with Single Binary Component
    Ma, Xindi
    Gao, Jie
    Liu, Xiaoyu
    Zhang, Taiping
    Tang, Yuan Yan
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 1139 - 1144
  • [2] Nonparametric Bayesian Nonnegative Matrix Factorization
    Xie, Hong-Bo
    Li, Caoyuan
    Mengersen, Kerrie
    Wang, Shuliang
    Da Xu, Richard Yi
    [J]. MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE (MDAI 2020), 2020, 12256 : 132 - 141
  • [3] Document clustering based on nonnegative sparse matrix factorization
    Yang, CF
    Ye, M
    Zhao, J
    [J]. ADVANCES IN NATURAL COMPUTATION, PT 2, PROCEEDINGS, 2005, 3611 : 557 - 563
  • [4] DISCRIMINANT SPARSE NONNEGATIVE MATRIX FACTORIZATION
    Zhi, Ruicong
    Ruan, Qiuqi
    [J]. ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3, 2009, : 570 - 573
  • [5] Binary Sparse Nonnegative Matrix Factorization
    Yuan, Yuan
    Li, Xuelong
    Pang, Yanwei
    Lu, Xin
    Tao, Dacheng
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2009, 19 (05) : 772 - 777
  • [6] Sparse Deep Nonnegative Matrix Factorization
    Guo, Zhenxing
    Zhang, Shihua
    [J]. BIG DATA MINING AND ANALYTICS, 2020, 3 (01) : 13 - 28
  • [7] Extended sparse nonnegative matrix factorization
    Stadlthanner, K
    Theis, FJ
    Puntonet, CG
    Lang, EW
    [J]. COMPUTATIONAL INTELLIGENCE AND BIOINSPIRED SYSTEMS, PROCEEDINGS, 2005, 3512 : 249 - 256
  • [8] Sparse Separable Nonnegative Matrix Factorization
    Nadisic, Nicolas
    Vandaele, Arnaud
    Cohen, Jeremy E.
    Gillis, Nicolas
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2020, PT I, 2021, 12457 : 335 - 350
  • [9] Sparse Deep Nonnegative Matrix Factorization
    Zhenxing Guo
    Shihua Zhang
    [J]. Big Data Mining and Analytics, 2020, (01) : 13 - 28
  • [10] Blind Spectral Unmixing Based on Sparse Nonnegative Matrix Factorization
    Yang, Zuyuan
    Zhou, Guoxu
    Xie, Shengli
    Ding, Shuxue
    Yang, Jun-Mei
    Zhang, Jun
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2011, 20 (04) : 1112 - 1125