Statistical mechanical analysis of sparse linear regression as a variable selection problem

Cited by: 7
Authors
Obuchi, Tomoyuki [1 ]
Nakanishi-Ohno, Yoshinori [2 ,3 ]
Okada, Masato [4 ]
Kabashima, Yoshiyuki [1 ]
Affiliations
[1] Tokyo Inst Technol, Dept Math & Comp Sci, Meguro Ku, Ookayama 2-12-1, Tokyo 1528550, Japan
[2] Univ Tokyo, Grad Sch Arts & Sci, Meguro Ku, Komaba 3-8-1, Tokyo 1538902, Japan
[3] Japan Sci & Technol Agcy, Precursory Res Embryon Sci & Technol, Honcho 4-1-8, Kawaguchi, Saitama 3320012, Japan
[4] Univ Tokyo, Grad Sch Frontier Sci, Kashiwanoha 5-1-5, Kashiwa, Chiba 2778561, Japan
Keywords
cavity and replica method; energy landscapes; statistical inference; UNCERTAINTY PRINCIPLES; RECOVERY; CDMA; PERFORMANCE; RELAXATION; ALGORITHM
DOI
10.1088/1742-5468/aae02c
Chinese Library Classification (CLC) number
O3 [Mechanics]
Discipline classification code
08; 0801
Abstract
An algorithmic limit of compressed sensing and related variable-selection problems is analytically evaluated when the design matrix is an overcomplete random matrix. The replica method from statistical mechanics is employed to derive the result. The analysis proceeds by evaluating the entropy, the exponential rate of the number of combinations of variables giving a specific value of the fit error to given data, which are assumed to be generated by a linear process using the design matrix. This yields the typical achievable limit of the fit error when solving a representative ℓ0 problem, and reveals the presence of unfavourable phase transitions that prevent local search algorithms from reaching the minimum-error configuration. The associated phase diagrams are presented. A noteworthy outcome of the phase diagrams is that there exists a wide parameter region in which no phase transition occurs from high temperatures down to the lowest temperature at which the minimum-error configuration, or the ground state, is reached. This implies that certain local search algorithms can find the ground state with moderate computational cost in that region. Another noteworthy result is the presence of a random first-order transition in the strong-noise case. The theoretical evaluation of the entropy is confirmed by extensive numerical experiments using the exchange Monte Carlo and multi-histogram methods. A further numerical test based on a metaheuristic optimisation algorithm, simulated annealing, is conducted and supports the theoretical predictions on local search algorithms well. In the successful region with no phase transition, the computational cost for simulated annealing to reach the ground state is estimated to scale as a third-order polynomial in the model dimensionality.
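To make the local-search setting of the abstract concrete, the following is a minimal, hypothetical Python sketch of simulated annealing over variable-selection configurations: a candidate support's energy is the least-squares fit error restricted to the selected columns of the design matrix, and each move swaps one selected variable for an unselected one at fixed sparsity. All names, sizes, and parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy instance: y = A x0 + noise, with a K-sparse x0.
N, M, K = 20, 15, 3
A = rng.standard_normal((M, N)) / np.sqrt(M)
x0 = np.zeros(N)
true_support = rng.choice(N, size=K, replace=False)
x0[true_support] = rng.standard_normal(K)
y = A @ x0 + 0.01 * rng.standard_normal(M)

def fit_error(support):
    """Energy: squared residual of the least-squares fit restricted to `support`."""
    As = A[:, sorted(support)]
    coef, *_ = np.linalg.lstsq(As, y, rcond=None)
    r = y - As @ coef
    return float(r @ r) / (2 * M)

def simulated_annealing(n_steps=2000, beta0=1.0, beta1=50.0):
    # Random initial support of size K and a linear inverse-temperature schedule.
    support = set(rng.choice(N, size=K, replace=False))
    energy = fit_error(support)
    for beta in np.linspace(beta0, beta1, n_steps):
        # Pair-flip move: swap one active index with one inactive index,
        # keeping the number of selected variables fixed at K.
        out_idx = rng.choice(sorted(support))
        in_idx = rng.choice(sorted(set(range(N)) - support))
        trial = (support - {out_idx}) | {in_idx}
        e_trial = fit_error(trial)
        # Metropolis acceptance at inverse temperature beta.
        if rng.random() < np.exp(-beta * max(0.0, e_trial - energy)):
            support, energy = trial, e_trial
    return support, energy

support, energy = simulated_annealing()
```

The pair-flip move keeps the sparsity fixed, mirroring the fixed number of selected variables in the variable-selection formulation; the schedule and step count here are arbitrary choices for a toy problem.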
Pages: 41
Related papers
50 records in total
  • [1] Exhaustive Search for Sparse Variable Selection in Linear Regression
    Igarashi, Yasuhiko
    Takenaka, Hikaru
    Nakanishi-Ohno, Yoshinori
    Uemura, Makoto
    Ikeda, Shiro
    Okada, Masato
    JOURNAL OF THE PHYSICAL SOCIETY OF JAPAN, 2018, 87 (04)
  • [2] Sparse linear regression in unions of bases via Bayesian variable selection
    Fevotte, Cedric
    Godsill, Simon J.
    IEEE SIGNAL PROCESSING LETTERS, 2006, 13 (07): 441-444
  • [3] Variable selection for sparse logistic regression
    Yin, Zanhua
    METRIKA, 2020, 83 (07): 821-836
  • [4] Variable selection in high-dimensional sparse multiresponse linear regression models
    Luo, Shan
    STATISTICAL PAPERS, 2020, 61 (03): 1245-1267
  • [5] On variable selection in linear regression
    Kabaila, P.
    ECONOMETRIC THEORY, 2002, 18 (04): 913-925
  • [6] Variable selection in linear regression
    Lindsey, Charles
    Sheather, Simon
    STATA JOURNAL, 2010, 10 (04): 650-669
  • [7] A variable selection proposal for multiple linear regression analysis
    Steel, S. J.
    Uys, D. W.
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2011, 81 (12): 2095-2105
  • [8] Sparse neural network regression with variable selection
    Shin, Jae-Kyung
    Bak, Kwan-Young
    Koo, Ja-Yong
    COMPUTATIONAL INTELLIGENCE, 2022, 38 (06): 2075-2094