Scalable Nonparametric Multiway Data Analysis

Cited by: 0
Authors
Zhe, Shandian [1]
Xu, Zenglin [2]
Chu, Xinqi [3]
Qi, Yuan [1]
Park, Youngja [4]
Affiliations
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[2] Univ Elect Sci & Technol China, Big Data Res Ctr, Sch Comp Sci & Engn, Chengdu, Peoples R China
[3] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL USA
[4] IBM Thomas J Watson Res Ctr, Ossining, NY USA
Keywords
MIXTURES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multiway data analysis deals with multiway arrays, i.e., tensors, and the goal is twofold: predicting missing entries by modeling the interactions between array elements and discovering hidden patterns, such as clusters or communities in each mode. Despite the success of existing tensor factorization approaches, they are either unable to capture nonlinear interactions or too computationally expensive to handle massive data. In addition, most of the existing methods lack a principled way to discover latent clusters, which is important for a better understanding of the data. To address these issues, we propose a scalable nonparametric tensor decomposition model. It employs a Dirichlet process mixture (DPM) prior to model the latent clusters; it uses local Gaussian processes (GPs) to capture nonlinear relationships and to improve scalability. An efficient online variational Bayes Expectation-Maximization algorithm is proposed to learn the model. Experiments on both synthetic and real-world data show that the proposed model is able to discover latent clusters with higher prediction accuracy than competitive methods. Furthermore, the proposed model obtains significantly better predictive performance than the state-of-the-art large-scale tensor decomposition algorithm, GigaTensor, on two large datasets with billions of entries.
Pages: 1125-1134
Number of pages: 10