Structure-constrained sparse canonical correlation analysis with an application to microbiome data analysis

Cited by: 112
Authors
Chen, Jun [1]
Bushman, Frederic D. [2]
Lewis, James D. [3]
Wu, Gary D. [3]
Li, Hongzhe [1]
Affiliations
[1] Univ Penn, Dept Biostat & Epidemiol, Perelman Sch Med, Philadelphia, PA 19104 USA
[2] Univ Penn, Dept Microbiol, Perelman Sch Med, Philadelphia, PA 19104 USA
[3] Univ Penn, Div Gastroenterol, Perelman Sch Med, Philadelphia, PA 19104 USA
Funding
National Institutes of Health (USA)
Keywords
Dimension reduction; Graph; Phylogenetic tree; Regularization; Variable selection; Sequences; Matrix
DOI
10.1093/biostatistics/kxs038
Chinese Library Classification (CLC) number
Q [Biological sciences]
Subject classification codes
07; 0710; 09
Abstract
Motivated by studying the association between nutrient intake and human gut microbiome composition, we developed a method for structure-constrained sparse canonical correlation analysis (ssCCA) in a high-dimensional setting. ssCCA takes into account the phylogenetic relationships among bacteria, which provides important prior knowledge on evolutionary relationships among bacterial taxa. Our ssCCA formulation utilizes a phylogenetic structure-constrained penalty function to impose certain smoothness on the linear coefficients according to the phylogenetic relationships among the taxa. An efficient coordinate descent algorithm is developed for optimization. A human gut microbiome data set is used to illustrate this method. Both simulations and real data applications show that ssCCA performs better than the standard sparse CCA in identifying meaningful variables when there are structures in the data.
Pages: 244-258
Page count: 15
Related papers
50 in total
  • [1] Sparse Canonical Correlation Analysis with Application to Genomic Data Integration
    Parkhomenko, Elena
    Tritchler, David
    Beyene, Joseph
    [J]. STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2009, 8 (01)
  • [2] Sparse canonical correlation analysis
    David R. Hardoon
    John Shawe-Taylor
    [J]. Machine Learning, 2011, 83 : 331 - 353
  • [3] Sparse canonical correlation analysis
    Hardoon, David R.
    Shawe-Taylor, John
    [J]. MACHINE LEARNING, 2011, 83 (03) : 331 - 353
  • [4] Group sparse canonical correlation analysis for genomic data integration
    Lin, Dongdong
    Zhang, Jigang
    Li, Jingyao
    Calhoun, Vince D.
    Deng, Hong-Wen
    Wang, Yu-Ping
    [J]. BMC BIOINFORMATICS, 2013, 14
  • [5] Distributed Sparse Canonical Correlation Analysis in Clustering Sensor Data
    Chen, Jia
    Schizas, Ioannis D.
    [J]. 2013 ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 2013, : 639 - 643
  • [6] Extensions of Sparse Canonical Correlation Analysis with Applications to Genomic Data
    Witten, Daniela M.
    Tibshirani, Robert J.
    [J]. STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2009, 8 (01)
  • [7] Group sparse canonical correlation analysis for genomic data integration
    Dongdong Lin
    Jigang Zhang
    Jingyao Li
    Vince D Calhoun
    Hong-Wen Deng
    Yu-Ping Wang
    [J]. BMC Bioinformatics, 14
  • [8] Sparse semiparametric canonical correlation analysis for data of mixed types
    Yoon, Grace
    Carroll, Raymond J.
    Gaynanova, Irina
    [J]. BIOMETRIKA, 2020, 107 (03) : 609 - 625
  • [9] Efficient and Fast Joint Sparse Constrained Canonical Correlation Analysis for Fault Detection
    Xiu, Xianchao
    Pan, Lili
    Yang, Ying
    Liu, Wanquan
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (03) : 4153 - 4163
  • [10] Robust sparse canonical correlation analysis
    Wilms, Ines
    Croux, Christophe
    [J]. BMC SYSTEMS BIOLOGY, 2016, 10