DISTRIBUTED SUFFICIENT DIMENSION REDUCTION FOR HETEROGENEOUS MASSIVE DATA

Cited by: 4
Authors
Xu, Kelin [1 ]
Zhu, Liping [2 ,3 ]
Fan, Jianqing [4 ]
Affiliations
[1] Fudan Univ, Sch Publ Hlth, Shanghai, Peoples R China
[2] Renmin Univ China, Ctr Appl Stat, Beijing, Peoples R China
[3] Renmin Univ China, Inst Stat & Big Data, Beijing, Peoples R China
[4] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ USA
Funding
Beijing Natural Science Foundation;
Keywords
Cumulative slicing estimation; distributed estimation; heterogeneity; sliced inverse regression; sufficient dimension reduction; SLICED INVERSE REGRESSION; CONFIDENCE-INTERVALS; ASYMPTOTICS;
DOI
10.5705/ss.202021.0031
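Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code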
020208 ; 070103 ; 0714 ;
Abstract
We propose a distributed sufficient dimension reduction method to process massive data characterized by high dimensionality, huge sample sizes, and heterogeneity. To address the high dimensionality, we replace the high-dimensional explanatory variables with a small number of linear projections that are sufficient to explain the variability of the response variable. We allow for distinctive function maps for data scattered at different locations, thus addressing the problem of heterogeneity. We assume that the dimension reduction subspaces at different local nodes are identical. This allows us to aggregate the local results obtained from each local node to yield a final estimate on a central server. We explicitly examine the sliced inverse regression and cumulative slicing estimation, and investigate the nonasymptotic error bounds of the resulting dimensionality reduction. Our theoretical results are further supported by simulation studies and an application to meta-genome data from the American Gut Project.
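The local-estimate-then-aggregate idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not state the aggregation rule, so averaging the local projection matrices on the server is an assumption made here for illustration, and the names `local_sir_projection`, `aggregate_projections`, and `run_demo`, as well as the simulated link functions, are hypothetical. Each node runs sliced inverse regression on its own data with its own link function, and all nodes share a single direction.

```python
import numpy as np

def local_sir_projection(X, y, d, n_slices=10):
    """Sliced inverse regression on one node.

    Returns the p x p projection matrix onto the d estimated directions.
    """
    n, p = X.shape
    Sigma = np.cov(X, rowvar=False)
    # Inverse square root of the covariance via its eigen-decomposition.
    w, V = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    Z = (X - X.mean(axis=0)) @ Sigma_inv_sqrt  # standardized predictors
    # Slice the sample by the order of y and average Z within each slice.
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # Leading eigenvectors of the SIR kernel, mapped back to the X scale.
    _, U = np.linalg.eigh(M)  # eigenvalues in ascending order
    B = Sigma_inv_sqrt @ U[:, -d:]
    Q, _ = np.linalg.qr(B)  # orthonormal basis of the estimated subspace
    return Q @ Q.T

def aggregate_projections(projections, d):
    """Server step (assumed rule): average projections, keep top-d eigenvectors."""
    P_bar = np.mean(projections, axis=0)
    _, U = np.linalg.eigh(P_bar)
    return U[:, -d:]

def run_demo(seed=0, n=2000, p=10):
    """Simulate heterogeneous nodes and return |cos| between estimate and truth."""
    rng = np.random.default_rng(seed)
    beta = np.zeros(p)
    beta[:2] = 1 / np.sqrt(2)  # common direction shared by all nodes
    # A different monotone link function on each node (heterogeneity).
    links = [
        lambda t: t,
        lambda t: t ** 3,
        np.exp,
        lambda t: 2 * t + np.sin(t),
        lambda t: np.sign(t) * np.sqrt(np.abs(t)),
    ]
    projections = []
    for link in links:
        X = rng.standard_normal((n, p))
        y = link(X @ beta) + 0.2 * rng.standard_normal(n)
        projections.append(local_sir_projection(X, y, d=1))
    b_hat = aggregate_projections(projections, d=1)[:, 0]
    return abs(b_hat @ beta)

print("absolute cosine with the true direction:", run_demo())
```

Only the p x p projection matrices travel to the server in this sketch, so the communication cost does not grow with the local sample sizes. Monotone links are used because sliced inverse regression is known to miss directions when the link is symmetric about zero.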
Pages: 2455 - 2476
Page count: 22
Related papers
50 in total
  • [41] Sufficient dimension reduction and graphics in regression
    Chiaromonte, F
    Cook, RD
    ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS, 2002, 54 (04) : 768 - 795
  • [42] A Note on Bootstrapping in Sufficient Dimension Reduction
    Yoo, Jae Keun
    Jeong, Sun
    COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS, 2015, 22 (03) : 285 - 294
  • [43] Sufficient Dimension Reduction and Graphics in Regression
    Francesca Chiaromonte
    R. Dennis Cook
    Annals of the Institute of Statistical Mathematics, 2002, 54 : 768 - 795
  • [44] Sufficient dimension reduction and prediction in regression
    University of Minnesota, School of Statistics, 313 Ford Hall, 224 Church Street Southeast, Minneapolis, MN 55455, United States
    Philos. Trans. R. Soc. A Math. Phys. Eng. Sci., 1600, 1906 (4385-4405):
  • [45] Sparse kernel sufficient dimension reduction
    Liu, Bingyuan
    Xue, Lingzhou
    JOURNAL OF NONPARAMETRIC STATISTICS, 2024,
  • [46] Using noise filtering and sufficient dimension reduction method on unstructured economic data
    Yoo, Jae Keun
    Park, Yujin
    Seo, Beomseok
    KOREAN JOURNAL OF APPLIED STATISTICS, 2024, 37 (02) : 119 - 138
  • [47] Principal minimax support vector machine for sufficient dimension reduction with contaminated data
    Zhou, Jingke
    Zhu, Lixing
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2016, 94 : 33 - 48
  • [48] Sufficient dimension reduction for survival data analysis with error-prone variables
    Chen, Li-Pang
    Yi, Grace Y.
    ELECTRONIC JOURNAL OF STATISTICS, 2022, 16 (01) : 2082 - 2123
  • [49] On sufficient dimension reduction for functional data: Inverse moment-based methods
    Song, Jun
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2019, 11 (04):
  • [50] Integrative Sufficient Dimension Reduction Methods for Multi-Omics Data Analysis
    Jain, Yashita
    Ding, Shanshan
    ACM-BCB' 2017: PROCEEDINGS OF THE 8TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY,AND HEALTH INFORMATICS, 2017, : 616 - 616