Scalable Nonparametric Multiway Data Analysis

Cited by: 0
Authors
Zhe, Shandian [1]
Xu, Zenglin [2]
Chu, Xinqi [3]
Qi, Yuan [1]
Park, Youngja [4]
Affiliations
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[2] Univ Elect Sci & Technol China, Big Data Res Ctr, Sch Comp Sci & Engn, Chengdu, Peoples R China
[3] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL USA
[4] IBM Thomas J Watson Res Ctr, Ossining, NY USA
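Keywords
MIXTURES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract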
Multiway data analysis deals with multiway arrays, i.e., tensors, and has a twofold goal: predicting missing entries by modeling the interactions between array elements, and discovering hidden patterns, such as clusters or communities, in each mode. Despite their success, existing tensor factorization approaches are either unable to capture nonlinear interactions or too computationally expensive to handle massive data. In addition, most existing methods lack a principled way to discover latent clusters, which is important for understanding the data better. To address these issues, we propose a scalable nonparametric tensor decomposition model. It employs a Dirichlet process mixture (DPM) prior to model the latent clusters, and it uses local Gaussian processes (GPs) to capture nonlinear relationships and to improve scalability. An efficient online variational Bayes Expectation-Maximization algorithm is proposed to learn the model. Experiments on both synthetic and real-world data show that the proposed model is able to discover latent clusters while achieving higher prediction accuracy than competing methods. Furthermore, the proposed model obtains significantly better predictive performance than the state-of-the-art large-scale tensor decomposition algorithm, GigaTensor, on two large datasets with billions of entries.
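The two ingredients named in the abstract can be illustrated with a small sketch. This is not the authors' model or algorithm: it assumes a truncated stick-breaking construction for the Dirichlet process weights, fixed random latent factors (the paper learns them), a single global GP with an RBF kernel in place of local GPs, and no variational inference. All function names, sizes, and the synthetic target are hypothetical choices made for illustration.

```python
import numpy as np


def stick_breaking(alpha, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process (illustrative)."""
    betas = rng.beta(1.0, alpha, size=truncation)
    betas[-1] = 1.0  # close the stick so the truncated weights sum to 1
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining


def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def gp_predict(x_train, y_train, x_test, noise=1e-2):
    """GP posterior mean at x_test given noisy observations at x_train."""
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_star = rbf_kernel(x_test, x_train)
    return k_star @ np.linalg.solve(k, y_train)


rng = np.random.default_rng(0)

# Mixture weights over (at most) 10 latent clusters.
weights = stick_breaking(alpha=1.0, truncation=10, rng=rng)

# Assumed setup: a 6 x 5 x 4 tensor with 2-dimensional latent factors per mode.
dims, rank = (6, 5, 4), 2
factors = [rng.normal(size=(d, rank)) for d in dims]

# The GP input for entry (i, j, k) is the concatenation of its latent factors.
all_idx = [(i, j, k) for i in range(dims[0])
           for j in range(dims[1]) for k in range(dims[2])]
X = np.stack([np.concatenate([factors[m][i] for m, i in enumerate(idx)])
              for idx in all_idx])

# Synthetic nonlinear entry values, used only to exercise the code.
y = np.sin(X.sum(axis=1))

# Hold out 30 of the 120 entries and predict them from the other 90.
observed = rng.permutation(len(all_idx))[:90]
missing = np.setdiff1d(np.arange(len(all_idx)), observed)
pred = gp_predict(X[observed], y[observed], X[missing])
```

Feeding the concatenated latent factors to a kernel is what lets the predictor capture nonlinear interactions between modes, which a multilinear factorization cannot. A single GP over all observed entries scales cubically with their number, which is the cost that the local GPs in the abstract are meant to avoid.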
Pages: 1125-1134
Number of pages: 10