Semi-Distance Correlation and Its Applications

Authors
Zhong, Wei [1, 2]
Li, Zhuoxi [2 ]
Guo, Wenwen [3 ]
Cui, Hengjian [3 ]
Affiliations
[1] Xiamen Univ, MOE, WISE, Key Lab Econometr, Xiamen, Peoples R China
[2] Xiamen Univ, Dept Stat & Data Sci, SOE, Xiamen, Peoples R China
[3] Capital Normal Univ, Sch Math Sci, Beijing, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
Groupwise variable screening; High dimensionality; Measures of dependence; Test of independence; SELECTION; MODELS; ASSOCIATION; DENSITY;
DOI
10.1080/01621459.2023.2284988
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
We propose a new measure of dependence between a categorical random variable and a random vector with potentially high dimensions, named semi-distance correlation. It is an interesting extension of distance correlation to accommodate the information of the categorical random variable. It equals zero if and only if the categorical random variable and the other random vector are independent. Two important applications of semi-distance correlation are considered. First, we develop a semi-distance independence test between a categorical random variable and a random vector and derive its asymptotic distributions. When the dimension of the random vector tends to infinity, we derive the explicit asymptotic normal distribution of the test statistic under the null hypothesis, which allows us to compute p-values in an efficient and fast way for high dimensional data. Second, we propose to use the semi-distance correlation as a marginal utility between the response and a group of covariates to do groupwise variable screening for ultrahigh dimensional classification problems. The sure screening property has also been established. Monte Carlo simulations and a real data application are presented to demonstrate the excellent finite sample property of the proposed procedures. A new R package semidist is also developed to implement the proposed methods. Supplementary materials for this article are available online.
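As a rough illustration of the kind of test the abstract describes, the sketch below computes the classical sample distance correlation between a random vector and a one-hot encoded categorical variable, then obtains a permutation p-value. It is not the semi-distance correlation defined in the article, and it does not use the article's asymptotic null distribution or the semidist package; the one-hot encoding, the function names, and the permutation scheme are all assumptions made for illustration.

```python
import math
import random


def centered_distances(points):
    """Double-centered Euclidean distance matrix of a list of equal-length vectors."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    row_mean = [sum(r) / n for r in d]          # equals column means (d is symmetric)
    grand_mean = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def dcov2(a, b):
    """Squared sample distance covariance from two centered matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2


def distance_correlation(x, y):
    """Classical sample distance correlation (a value in [0, 1])."""
    a, b = centered_distances(x), centered_distances(y)
    denom = math.sqrt(dcov2(a, a) * dcov2(b, b))
    if denom == 0.0:
        return 0.0
    return math.sqrt(max(dcov2(a, b), 0.0) / denom)


def one_hot(labels):
    """Hypothetical encoding of the categorical variable as indicator vectors."""
    levels = sorted(set(labels))
    return [[1.0 if lab == lev else 0.0 for lev in levels] for lab in labels]


def permutation_pvalue(x, labels, n_perm=199, seed=0):
    """Permutation p-value for independence of x and labels (illustrative only)."""
    rng = random.Random(seed)
    n = len(x)
    a = centered_distances(x)
    b = centered_distances(one_hot(labels))
    observed = dcov2(a, b)
    exceed = 0
    for _ in range(n_perm):
        perm = list(range(n))
        rng.shuffle(perm)
        # Permuting the labels permutes rows and columns of b together.
        b_perm = [[b[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
        if dcov2(a, b_perm) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    labels = ["a"] * 10 + ["b"] * 10
    # Two-dimensional observations whose mean depends on the class label.
    x = [[rng.gauss(0 if lab == "a" else 5, 1), rng.gauss(0, 1)] for lab in labels]
    print(distance_correlation(x, one_hot(labels)))
    print(permutation_pvalue(x, labels))
```

The permutation loop costs O(n^2) per replicate, which is one reason an explicit asymptotic null distribution, such as the one the abstract mentions for high-dimensional data, is attractive in practice.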
Pages: 2919-2933
Page count: 15