Variable Selection based on Maximum Information Coefficient for Data Modeling

Cited by: 0
Authors
Chu, Fuchang [1]
Fan, Zhenping [1]
Guo, Baohui [1]
Zhi, Dan [1]
Yin, Zijian [1]
Zhao, Wenjie [1]
Affiliation
[1] North China Elect Power Univ, Coll Automat, Baoding, Peoples R China
Keywords
mutual information; variable selection; maximum information coefficient
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The accuracy of variable selection affects both the accuracy and the generalization ability of a model. Traditional variable selection methods struggle to remain stable under high collinearity. To address this problem, we propose a new method, MICFS (Feature Select based on Maximal Information Coefficient), which combines the maximum information coefficient with an existing mutual-information-based variable selection method. The paper first introduces the theory of mutual information and the variable selection algorithm based on it, and then replaces the original mutual information criterion with the maximum information coefficient. Finally, the validity of the method is verified on the Friedman data set. The results show that the method meets the requirements of variable selection in a high-collinearity, high-noise environment.
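The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea (scoring each candidate variable against the target with a MIC-style statistic and ranking by that score). It is not the authors' MICFS implementation. Several things here are assumptions made for illustration: the data generator uses the standard Friedman #1 formula, in which only the first five inputs drive the output; `mic_like` is a hypothetical, simplified stand-in for the maximum information coefficient that searches equal-frequency grids only, rather than optimizing grid placement as the full MIC procedure does; and the helper names, bin limit, and sample size are arbitrary.

```python
import math
import random


def friedman1(n, n_noise=5, sigma=1.0, seed=0):
    """Friedman #1 benchmark (assumed variant): inputs 0..4 are relevant."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.random() for _ in range(5 + n_noise)]
        target = (10 * math.sin(math.pi * x[0] * x[1])
                  + 20 * (x[2] - 0.5) ** 2
                  + 10 * x[3] + 5 * x[4]
                  + rng.gauss(0, sigma))
        X.append(x)
        y.append(target)
    return X, y


def equal_freq_bins(values, k):
    """Assign each value to one of k equal-frequency bins, by rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * k // len(values)
    return bins


def grid_mi(bx, by):
    """Mutual information (in bits) between two discretized variables."""
    n = len(bx)
    joint, px, py = {}, {}, {}
    for a, b in zip(bx, by):
        joint[(a, b)] = joint.get((a, b), 0) + 1
        px[a] = px.get(a, 0) + 1
        py[b] = py.get(b, 0) + 1
    return sum(c / n * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())


def mic_like(x, y, max_bins=6):
    """Simplified MIC-style score: the largest normalized grid MI found."""
    best = 0.0
    for kx in range(2, max_bins + 1):
        bx = equal_freq_bins(x, kx)
        for ky in range(2, max_bins + 1):
            by = equal_freq_bins(y, ky)
            # Dividing by log2(min(kx, ky)) keeps the score within [0, 1].
            best = max(best, grid_mi(bx, by) / math.log2(min(kx, ky)))
    return best


# Score every input column against the target and rank by score.
X, y = friedman1(n=1000)
columns = [list(col) for col in zip(*X)]
scores = [mic_like(col, y) for col in columns]
ranking = sorted(range(len(scores)), key=lambda j: -scores[j])
print(ranking)
```

A pure ranking like this does not account for redundancy among collinear inputs, which is the setting the abstract targets; a fuller treatment would also penalize candidates that are strongly dependent on variables already selected.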
Pages: 1714-1717
Number of pages: 4
Related papers
50 in total
  • [31] Variable selection in robust semiparametric modeling for longitudinal data
    Wang, Kangning
    Lin, Lu
    JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2014, 43 (02) : 303 - 314
  • [32] Split variable selection for tree modeling on rank data
    Kung, Yi-Hung
    Lin, Chang-Ting
    Shih, Yu-Shan
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2012, 56 (09) : 2830 - 2836
  • [33] Variable selection in robust semiparametric modeling for longitudinal data
    Kangning Wang
    Lu Lin
    Journal of the Korean Statistical Society, 2014, 43 : 303 - 314
  • [34] Variable Selection in Semiparametric Quantile Modeling for Longitudinal Data
    Wang, Kangning
    Lin, Lu
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2015, 44 (11) : 2243 - 2266
  • [35] Feature selection for IoT based on maximal information coefficient
    Sun, Guanglu
    Li, Jiabin
    Dai, Jian
    Song, Zhichao
    Lang, Fei
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2018, 89 : 606 - 616
  • [36] Feature selection based on mutual information with correlation coefficient
    Hongfang Zhou
    Xiqian Wang
    Rourou Zhu
    Applied Intelligence, 2022, 52 : 5457 - 5474
  • [37] Feature selection based on mutual information with correlation coefficient
    Zhou, Hongfang
    Wang, Xiqian
    Zhu, Rourou
    APPLIED INTELLIGENCE, 2022, 52 (05) : 5457 - 5474
  • [38] Pseudo-likelihood-based Bayesian information criterion for variable selection in survey data
    Xu, Chen
    Chen, Jiahua
    Mantel, Harold
    SURVEY METHODOLOGY, 2013, 39 (02) : 303 - 321
  • [39] Variable selection for semiparametric varying coefficient partially linear model based on modal regression with missing data
    Xia, Yafeng
    Qu, Yarong
    Sun, Nailing
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2019, 48 (20) : 5121 - 5137
  • [40] Variable selection for partially varying coefficient model based on modal regression under high dimensional data
    Xia, Yafeng
    Zhang, Lirong
    Zhang, Aiping
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2022, 51 (01) : 232 - 248