FastStep: Scalable Boolean Matrix Decomposition

Cited by: 4
Authors
Araujo, Miguel [1, 2]
Ribeiro, Pedro [1]
Faloutsos, Christos [2]
Affiliations
[1] Univ Porto, Cracs INESC TEC, Oporto, Portugal
[2] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
DOI
10.1007/978-3-319-31753-3_37
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Matrix decomposition methods are applied to a wide range of tasks, such as data denoising, dimensionality reduction, co-clustering, and community detection. However, when the input is Boolean, common methods either do not scale or do not provide a Boolean reconstruction, which results in high reconstruction error and low interpretability of the decomposition. We propose a novel step decomposition of Boolean matrices into non-negative factors with Boolean reconstruction. By formulating the problem using threshold operators and suitably relaxing it, we provide a scalable algorithm that can be applied to Boolean matrices with millions of non-zero entries. We show that our method achieves significantly lower reconstruction error than standard state-of-the-art algorithms. We also show that the decomposition remains interpretable by analyzing communities in a flights dataset (where the matrix is interpreted as a graph in which nodes are airports) and in a movie-ratings dataset with 10 million non-zeros.
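The idea of a Boolean reconstruction from non-negative factors through a threshold operator can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the threshold value `tau`, the use of `>=` in the comparison, and the hand-picked factors `A` and `B` are all assumptions made for illustration, and no factor learning is performed.

```python
def step_reconstruct(A, B, tau=1.0):
    """Boolean reconstruction: entry (i, j) is 1 when the dot product of
    row i of A and row j of B reaches the threshold tau, else 0.
    (The threshold value and the >= comparison are assumptions.)"""
    return [
        [1 if sum(a * b for a, b in zip(row_a, row_b)) >= tau else 0
         for row_b in B]
        for row_a in A
    ]


def reconstruction_error(M, R):
    """Number of entries where the Boolean matrices M and R disagree."""
    return sum(m != r for row_m, row_r in zip(M, R) for m, r in zip(row_m, row_r))


# Toy Boolean matrix with two overlapping groups (hypothetical data).
M = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
# Hand-picked non-negative rank-2 factors (not learned).
A = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
B = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]

R = step_reconstruct(A, B, tau=1.0)
error = reconstruction_error(M, R)
```

In this toy case the plain product of the factors has the value 2 at the center entry, so it is not Boolean, while the thresholded reconstruction `R` matches `M` exactly and `error` is 0.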
Pages: 461-473
Number of pages: 13