A joint estimation for the high-dimensional regression modeling on stratified data

Cited by: 0
Authors
Gao, Yimiao [1]
Yang, Yuehan [1]
Affiliations
[1] Cent Univ Finance & Econ, Sch Stat & Math, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Clustering; DBSCAN; Penalization; Lasso; Regression; Stratified analysis; Selection; Algorithm; Sparsity
DOI
10.1080/03610918.2021.2008435
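Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract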
This paper considers the estimation of regression models when data are collected in a stratified mode defined by a categorical variable. Such data arise frequently in practice because data are often collected from various sources. Most of the literature analyzes the data assuming that the stratification information is known, but this information is not always available. In this paper, we assume the stratification information is unknown. The proposed joint estimation combines a clustering technique with penalized regression modeling, so that it can be applied to high-dimensional stratified data without knowing the strata in advance. We show that the proposed method enjoys asymptotic properties. Simulations and empirical studies confirm that our method outperforms methods that ignore stratification. We apply the proposed method to gene expression data and temperature data, obtaining some meaningful results.
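The combination of clustering and penalized regression described in the abstract can be illustrated with a rough two-step sketch. This is not the authors' joint estimator, whose details the abstract does not give. It assumes the strata can be recovered by running DBSCAN on the covariates, and it then fits a separate Lasso model within each recovered cluster using scikit-learn. The function names `fit_stratified_lasso` and `simulate_two_strata`, the simulated data, and all tuning values (`eps`, `min_samples`, `alpha`) are hypothetical choices made for illustration only.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.linear_model import Lasso


def fit_stratified_lasso(X, y, eps=6.0, min_samples=5, alpha=0.1):
    """Cluster rows with DBSCAN, then fit one Lasso model per cluster.

    Illustrative two-step pipeline only; the tuning values are assumptions.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    models = {}
    for k in np.unique(labels):
        if k == -1:
            # DBSCAN labels noise points as -1; this sketch simply skips them.
            continue
        mask = labels == k
        models[int(k)] = Lasso(alpha=alpha).fit(X[mask], y[mask])
    return labels, models


def simulate_two_strata(n=100, p=10, shift=8.0, seed=0):
    """Toy data: two strata with shifted covariates and different sparse signals."""
    rng = np.random.default_rng(seed)
    X0 = rng.normal(size=(n, p))
    X1 = rng.normal(size=(n, p)) + shift  # second stratum is shifted in every feature
    beta0 = np.zeros(p)
    beta0[0] = 3.0   # stratum 0: only feature 0 is active
    beta1 = np.zeros(p)
    beta1[1] = -3.0  # stratum 1: only feature 1 is active
    y0 = X0 @ beta0 + 0.5 * rng.normal(size=n)
    y1 = X1 @ beta1 + 0.5 * rng.normal(size=n)
    return np.vstack([X0, X1]), np.concatenate([y0, y1])


if __name__ == "__main__":
    X, y = simulate_two_strata()
    labels, models = fit_stratified_lasso(X, y)
    for k, model in models.items():
        active = np.flatnonzero(model.coef_)
        print(f"cluster {k}: n = {np.sum(labels == k)}, active features = {active}")
```

In this toy setting a single Lasso fit on the pooled data would mix the two coefficient vectors, whereas the per-cluster fits can recover a different sparse signal in each stratum. A joint procedure such as the one the paper proposes may couple the two steps more tightly than this sequential sketch does.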
Pages: 6129-6140
Page count: 12