Distributed optimization for penalized regression in massive compositional data

Cited by: 0

Authors:

Chao, Yue [1,2]

Huang, Lei [3]

Ma, Xuejun [1]

Affiliations:

[1] Soochow Univ, Sch Math Sci, Dept Stat, Suzhou, Peoples R China

[2] Xiamen Univ, MOE Key Lab Econometr, WISE, Xiamen, Peoples R China

[3] Southwest Jiaotong Univ, Sch Math, Dept Stat, Chengdu, Peoples R China

Source:

APPLIED MATHEMATICAL MODELLING | 2025 / Vol. 141

Funding:

National Natural Science Foundation of China;

Keywords:

Massive compositional data; Distributed optimization; Augmented Lagrangian; Coordinate-wise descent; Variable selection; Medical insurance; QUANTILE REGRESSION; VARIABLE SELECTION; LIKELIHOOD; ALGORITHMS;

DOI:

10.1016/j.apm.2025.115950

Chinese Library Classification:

T [Industrial Technology];

Discipline classification code:

08 ;

Abstract:

Compositional data are widely used in various fields to analyze parts of a whole, providing insight into proportional relationships. With the increasing availability of extraordinarily large compositional datasets, distributed statistical methodology and computation have become essential in the era of big data. This paper focuses on the optimization methodology and practical application of the distributed sparse penalized linear log-contrast model for massive compositional data, specifically in the context of medical insurance reimbursement ratio prediction. We propose two distributed optimization techniques, tailored for centralized and decentralized topologies, to tackle the constrained convex optimization problems that arise in this application. Our algorithms are rooted in the frameworks of the alternating direction method of multipliers and the coordinate descent method of multipliers, making them applicable to distributed data scenarios. Notably, in the decentralized topology, we introduce a distributed coordinate-wise descent algorithm that employs a group alternating direction method of multipliers to achieve efficient distributed regularized estimation. We present a rigorous convergence analysis for our decentralized algorithm, ensuring its reliability for practical applications. Through numerical experiments on both simulated datasets and a real-world medical insurance dataset, we evaluate the performance of our proposed algorithms.
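
For orientation only, the centralized setting described in the abstract can be illustrated with a generic consensus ADMM for a lasso-penalized log-contrast regression. This is a minimal sketch and not the authors' algorithm: it assumes the standard log-contrast formulation (least squares on log-compositions with an L1 penalty and a zero-sum constraint on the coefficients), and the function names, the simulated data, and the values of `lam` and `rho` are all hypothetical.

```python
import numpy as np

def soft(v, t):
    """Elementwise soft-thresholding operator."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def zero_sum_soft(v, t, iters=100):
    """Minimize t*||z||_1 + 0.5*||z - v||^2 subject to sum(z) = 0.

    The solution is soft(v - mu, t) for a scalar shift mu; the sum is
    nonincreasing in mu, so mu is located by bisection.
    """
    lo, hi = v.min() - t, v.max() + t
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        if soft(v - mu, t).sum() > 0:
            lo = mu
        else:
            hi = mu
    return soft(v - 0.5 * (lo + hi), t)

def consensus_admm(Z_parts, y_parts, lam, rho=20.0, iters=300):
    """Consensus ADMM over K data blocks (hypothetical sketch).

    Targets: (1/2n)||y - Z b||^2 + lam*||b||_1  subject to  sum(b) = 0.
    """
    K, p = len(Z_parts), Z_parts[0].shape[1]
    n = sum(len(y) for y in y_parts)
    # Each worker caches its own ridge-type system once.
    inv = [np.linalg.inv(Zk.T @ Zk + rho * np.eye(p)) for Zk in Z_parts]
    zty = [Zk.T @ yk for Zk, yk in zip(Z_parts, y_parts)]
    z = np.zeros(p)          # global (consensus) coefficients
    U = np.zeros((K, p))     # scaled dual variables, one row per worker
    for _ in range(iters):
        # Local updates: could run in parallel on the workers.
        B = np.array([inv[k] @ (zty[k] + rho * (z - U[k])) for k in range(K)])
        # Central update: constrained soft-thresholding of the average.
        z = zero_sum_soft((B + U).mean(axis=0), n * lam / (K * rho))
        # Dual updates.
        U += B - z
    return z

# Simulated compositional data (illustrative values only).
rng = np.random.default_rng(0)
n, p, K = 600, 8, 3
X = rng.dirichlet(np.ones(p), size=n)        # each row sums to 1
Z = np.log(X)                                # log-compositions
beta_true = np.array([1.0, -0.8, 0.6, 0.0, 0.0, -1.5, 0.0, 0.7])  # sums to 0
y = Z @ beta_true + 0.1 * rng.standard_normal(n)

beta_hat = consensus_admm(np.array_split(Z, K), np.array_split(y, K), lam=0.01)
```

In this sketch only the p-dimensional vectors are exchanged between the workers and the center, never the raw data blocks; the zero-sum constraint is enforced in the central update rather than on each worker.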

Pages: 23
