Regression Models for Compositional Data: General Log-Contrast Formulations, Proximal Optimization, and Microbiome Data Applications

Authors
Patrick L. Combettes
Christian L. Müller
Affiliations
[1] North Carolina State University, Department of Mathematics
[2] Flatiron Institute, Center for Computational Mathematics
[3] Institute of Computational Biology, Department of Statistics
[4] Helmholtz Zentrum München
[5] Ludwig-Maximilians-Universität München
Source
Statistics in Biosciences | 2021 / Vol. 13, No. 2
Keywords
Compositional data; Convex optimization; Log-contrast model; Microbiome; Perspective function; Proximal algorithm
DOI
Not available
Abstract
Compositional data sets are ubiquitous in science, including geology, ecology, and microbiology. In microbiome research, compositional data primarily arise from high-throughput sequence-based profiling experiments. These data comprise microbial compositions in their natural habitat and are often paired with covariate measurements that characterize physicochemical habitat properties or the physiology of the host. Inferring parsimonious statistical associations between microbial compositions and habitat- or host-specific covariate data is an important step in exploratory data analysis. A standard statistical model linking compositional covariates to continuous outcomes is the linear log-contrast model. This model describes the response as a linear combination of log-ratios of the original compositions and has been extended to the high-dimensional setting via regularization. In this contribution, we propose a general convex optimization model for linear log-contrast regression which includes many previous proposals as special cases. We introduce a proximal algorithm that solves the resulting constrained optimization problem exactly with rigorous convergence guarantees. We illustrate the versatility of our approach by investigating the performance of several model instances on soil and gut microbiome data analysis tasks.
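The linear log-contrast model mentioned in the abstract can be illustrated with a minimal sketch. It assumes the classical unregularized form, y ≈ log(X) β with the zero-sum constraint sum(β) = 0 and no intercept, fitted by ordinary least squares. This is not the proximal algorithm or the general convex model proposed in the paper. The function name `fit_log_contrast` and the synthetic data are hypothetical and serve only as illustration.

```python
import numpy as np

def fit_log_contrast(X, y):
    """Least-squares fit of y ~ log(X) @ beta subject to sum(beta) == 0.

    X: (n, p) array of strictly positive compositions; y: (n,) responses.
    Illustrative only: no intercept, no regularization.
    """
    Z = np.log(X)
    # Under the zero-sum constraint, log(X) @ beta equals Zc @ beta, where Zc
    # is Z with each row centered (the centered log-ratio transform).
    Zc = Z - Z.mean(axis=1, keepdims=True)
    # lstsq returns the minimum-norm solution, which lies in the row space of
    # Zc; every row of Zc sums to zero, so the solution sums to zero as well.
    beta, *_ = np.linalg.lstsq(Zc, y, rcond=None)
    return beta

# Synthetic data (assumed for illustration, not taken from the paper).
rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(5), size=50)             # rows are compositions
beta_true = np.array([1.0, -1.0, 0.5, -0.5, 0.0])  # zero-sum coefficients
y = np.log(X) @ beta_true                          # noise-free response
beta_hat = fit_log_contrast(X, y)

# Rescaling each sample (for example, a different sequencing depth) leaves
# the predictions unchanged because the coefficients sum to zero.
scale = rng.uniform(1.0, 1000.0, size=(50, 1))
same = np.allclose(np.log(X) @ beta_hat, np.log(scale * X) @ beta_hat)
```

The zero-sum constraint is what makes the fit depend only on log-ratios of the components: multiplying a sample by a constant c adds log(c) · sum(β) = 0 to its prediction.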
Pages: 217-242
Number of pages: 25