Multiview Regularized Discriminant Canonical Correlation Analysis: Sequential Extraction of Relevant Features From Multiblock Data

Cited by: 3
Authors
Mandal, Ankita [1]
Maji, Pradipta [1]
Affiliations
[1] Indian Stat Inst, Machine Intelligence Unit, Biomed Imaging & Bioinformat Lab, Kolkata 700108, India
Keywords
Feature extraction; Correlation; Covariance matrices; Data mining; Data analysis; Optimization; Statistical analysis; Canonical correlation analysis (CCA); multimodal data analysis; ridge regression optimization; SETS
DOI
10.1109/TCYB.2022.3155875
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
One of the important issues in real-life high-dimensional data analysis is how to extract significant and relevant features from multiview data. Multiset canonical correlation analysis (MCCA) is a well-known statistical method for multiview data integration. It finds a linear subspace that maximizes the correlations among different views. However, the existing methods for finding the multiset canonical variables are computationally very expensive, which restricts the application of MCCA in real-life big data analysis. The covariance matrix of each high-dimensional view may also be singular because the number of samples is limited. Moreover, existing MCCA-based feature extraction algorithms are, in general, unsupervised. In this regard, a new supervised feature extraction algorithm is proposed, which integrates multimodal multidimensional data sets by solving the maximal correlation problem of MCCA. A new block matrix representation is introduced to reduce the computational complexity of computing the canonical variables of MCCA. The analytical formulation enables efficient computation of the multiset canonical variables under a supervised ridge regression optimization technique. It deals with the "curse of dimensionality" problem associated with high-dimensional data and facilitates the sequential generation of relevant features at significantly lower computational cost. The effectiveness of the proposed multiblock data integration algorithm, along with a comparison with other existing methods, is demonstrated on several benchmark and real-life cancer data sets.
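To make the singularity issue mentioned in the abstract concrete, here is a minimal sketch of ridge-regularized CCA for just two views, written with NumPy. It is not the authors' algorithm: it is unsupervised, handles only two views, and uses a standard whitening-plus-SVD solution. The function name `regularized_cca` and the parameter `reg` are assumptions introduced for illustration only. Adding `reg` times the identity to each view's covariance keeps it invertible even when there are fewer samples than features.

```python
import numpy as np


def regularized_cca(X, Y, reg=0.1, n_components=1):
    """Two-view ridge-regularized CCA (illustrative sketch, hypothetical helper).

    X: (n_samples, p) array, Y: (n_samples, q) array.
    Returns weight matrices Wx (p, k), Wy (q, k) and the k regularized
    canonical correlations.
    """
    # Center each view.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Ridge term makes the covariances positive definite even if n < p.
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)

    # Whiten with Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)

    # T = Lx^{-1} Cxy Ly^{-T}; its singular values are the correlations.
    T = np.linalg.solve(Lx, Cxy)
    T = np.linalg.solve(Ly, T.T).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)

    # Map singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return Wx, Wy, s[:n_components]


# Synthetic example: 20 samples, 50 and 40 features, one shared latent factor.
rng = np.random.default_rng(0)
z = rng.normal(size=(20, 1))
X = z @ rng.normal(size=(1, 50)) + 0.1 * rng.normal(size=(20, 50))
Y = z @ rng.normal(size=(1, 40)) + 0.1 * rng.normal(size=(20, 40))
Wx, Wy, corr = regularized_cca(X, Y, reg=0.1)
```

With `reg=0` the Cholesky factorization would fail on this data, since each sample covariance has rank at most 19. Extending the idea to more than two views, adding class labels, and extracting components sequentially is what the paper addresses and is outside the scope of this sketch.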
Pages: 5497-5509
Number of pages: 13
Related papers
44 in total
  • [1] Regularized generalized canonical correlation analysis for multiblock or multigroup data analysis
    Tenenhaus, Arthur
    Tenenhaus, Michel
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2014, 238 (02) : 391 - 403
  • [3] Regularized Generalized Canonical Correlation Analysis: A Framework for Sequential Multiblock Component Methods
    Tenenhaus, Michel
    Tenenhaus, Arthur
    Groenen, Patrick J. F.
    PSYCHOMETRIKA, 2017, 82 (03) : 737 - 777
  • [4] Canonical Correlation Analysis for Multiview Semisupervised Feature Extraction
    Kursun, Olcay
    Alpaydin, Ethem
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, PT I, 2010, 6113 : 430 - +
  • [5] Regularized canonical correlation analysis with unlabeled data
    Zhou, Xi-chuan
    Shen, Hai-bin
    JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE A, 2009, 10 (04) : 504 - 511
  • [8] Multiview Gait Recognition Based on Patch Distribution Features and Uncorrelated Multilinear Sparse Local Discriminant Canonical Correlation Analysis
    Hu, Haifeng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2014, 24 (04) : 617 - 630
  • [9] Efficient and Distributed Generalized Canonical Correlation Analysis for Big Multiview Data
    Fu, Xiao
    Huang, Kejun
    Papalexakis, Evangelos E.
    Song, Hyun Ah
    Talukdar, Partha
    Sidiropoulos, Nicholas D.
    Faloutsos, Christos
    Mitchell, Tom
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019, 31 (12) : 2304 - 2318
  • [10] Sparse additive discriminant canonical correlation analysis for multiple features fusion
    Wang, Zhan
    Wang, Lizhi
    Huang, Hua
    NEUROCOMPUTING, 2021, 463 : 185 - 197