Log-ratio lasso: Scalable, sparse estimation for log-ratio models

Cited by: 18
Authors
Bates, Stephen [1]
Tibshirani, Robert [1, 2]
Affiliations
[1] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Biomed Data Sci, Stanford, CA 94305 USA
Keywords
compositional data; lasso; log-ratio; mass spectrometry; variable selection; post-selection inference; regression
DOI
10.1111/biom.12995
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Positive-valued signal data is common in the biological and medical sciences, due to the prevalence of mass spectrometry and other imaging techniques. With such data, only the relative intensities of the raw measurements are meaningful. It is desirable to consider models consisting of the log-ratios of all pairs of the raw features, since log-ratios are the simplest meaningful derived features. In this case, however, the dimensionality of the predictor space becomes large, and computationally efficient estimation procedures are required. In this work, we introduce an embedding of the log-ratio parameter space into a space of much lower dimension and use this representation to develop an efficient penalized fitting procedure. This procedure serves as the foundation for a two-step fitting procedure that combines a convex filtering step with a second non-convex pruning step to yield highly sparse solutions. On a cancer proteomics data set, the proposed method fits a highly sparse model consisting of features of known biological relevance while greatly improving upon the predictive accuracy of less interpretable methods.
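The abstract does not spell out the lower-dimensional embedding, so the following is only a minimal sketch of one standard identity that makes such a reduction plausible: a linear model in all p(p-1)/2 pairwise log-ratios yields the same predictions as a linear model in the p log-features whose coefficients sum to zero. The function names `pairwise_log_ratios` and `collapse_to_log_coefficients` are hypothetical and the data are simulated; this is not the authors' implementation or their fitting procedure.

```python
import numpy as np
from itertools import combinations


def pairwise_log_ratios(X):
    """Build the n x p(p-1)/2 matrix of log(x_j / x_k) for all pairs j < k."""
    logX = np.log(X)
    pairs = list(combinations(range(X.shape[1]), 2))
    Z = np.column_stack([logX[:, j] - logX[:, k] for j, k in pairs])
    return Z, pairs


def collapse_to_log_coefficients(theta, pairs, p):
    """Map pairwise coefficients theta_jk to p coefficients on log(x_j).

    Each theta_jk contributes +theta_jk to feature j and -theta_jk to
    feature k, so the resulting vector always sums to zero.
    """
    beta = np.zeros(p)
    for t, (j, k) in zip(theta, pairs):
        beta[j] += t
        beta[k] -= t
    return beta


# Simulated positive-valued data (illustrative only).
rng = np.random.default_rng(0)
n, p = 5, 6
X = rng.uniform(0.5, 2.0, size=(n, p))

Z, pairs = pairwise_log_ratios(X)
theta = rng.normal(size=len(pairs))          # arbitrary pairwise coefficients
beta = collapse_to_log_coefficients(theta, pairs, p)

pred_pairs = Z @ theta                        # model in 15 log-ratio features
pred_logs = np.log(X) @ beta                  # same model in 6 log-features

# Rescaling each sample by a positive constant leaves predictions unchanged,
# reflecting that only relative intensities matter.
scale = rng.uniform(0.1, 10.0, size=(n, 1))
pred_scaled = np.log(X * scale) @ beta
```

Under these assumptions, `pred_pairs` and `pred_logs` agree up to floating-point error, and the zero-sum constraint on `beta` is what makes the predictions invariant to per-sample rescaling of the raw intensities.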
Pages: 613-624
Page count: 12