A sparse version of the ridge logistic regression for large-scale text categorization

Cited by: 28
Authors
Aseervatham, Sujeevan [1 ]
Antoniadis, Anestis [2 ]
Gaussier, Eric [1 ]
Burlet, Michel [3 ]
Denneulin, Yves [4 ]
Affiliations
[1] Univ Grenoble 1, LIG, F-38041 Grenoble 9, France
[2] Univ Grenoble 1, LJK, F-38041 Grenoble 9, France
[3] Univ Grenoble 1, Lab Leibniz, F-38031 Grenoble 1, France
[4] ENSIMAG, LIG, F-38330 Montbonnot St Martin, France
Keywords
Logistic regression; Model selection; Text categorization; Large scale categorization; REGULARIZATION; SELECTION; MODEL;
DOI
10.1016/j.patrec.2010.09.023
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The ridge logistic regression has been used successfully in text categorization problems, and it has been shown to reach the same performance as the Support Vector Machine, with the main advantage of computing a probability value rather than a score. However, the dense solution of the ridge makes its use impractical for large-scale categorization. On the other hand, LASSO regularization is able to produce sparse solutions, but its performance is dominated by the ridge when the number of features is larger than the number of observations and/or when the features are highly correlated. In this paper, we propose a new model selection method which tries to approach the ridge solution by a sparse solution. The method first computes the ridge solution and then performs feature selection. The experimental evaluations show that our method gives a solution which is a good trade-off between the ridge and LASSO solutions. (C) 2010 Elsevier B.V. All rights reserved.
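The two-stage idea in the abstract (fit a ridge solution, then select features) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe the authors' selection criterion, so the sketch substitutes a simple rule that keeps the k largest-magnitude ridge weights. The function names, the toy data, and the hyperparameters `lam`, `lr`, `epochs`, and `k` are all hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_ridge_logistic(X, y, lam=0.1, lr=0.5, epochs=500):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    Labels are expected in {0, 1}; no intercept term is fitted.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]  # gradient of the L2 (ridge) penalty
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad[j] += err * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def sparsify_top_k(w, k):
    """Assumed selection rule (not the paper's): keep the k largest |w_j|."""
    order = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
    keep = set(order[:k])
    return [wj if j in keep else 0.0 for j, wj in enumerate(w)]

def predict_proba(w, x):
    # Probability of the positive class under the (possibly sparse) model.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))

# Toy data: feature 0 signals class 1, feature 1 signals class 0,
# feature 2 carries no class information.
X = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]
y = [1, 1, 0, 0]

w_ridge = fit_ridge_logistic(X, y)        # dense ridge solution
w_sparse = sparsify_top_k(w_ridge, k=2)   # sparse approximation of it
```

Because the sparse vector reuses the ridge weights on the retained features, `predict_proba` still returns a probability rather than an uncalibrated score, which is the property the abstract highlights for ridge logistic regression.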
Pages: 101-106
Number of pages: 6
Related papers
50 in total
  • [21] Efficient Vertical Federated Learning Method for Ridge Regression of Large-Scale Samples
    Cai, Jianping
    Liu, Ximeng
    Yu, Zhiyong
    Guo, Kun
    Li, Jiayin
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2023, 11 (02) : 511 - 526
  • [22] Detection of redundant traffic in large-scale communication networks based on logistic regression
    Wen X.
    Huang L.
    Zheng Y.
    Zhao H.
    International Journal of Reasoning-based Intelligent Systems, 2024, 16 (01) : 8 - 15
  • [23] Large-Scale Elastic Net Regularized Linear Classification SVMs and Logistic Regression
    Balamurugan, P.
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 949 - 954
  • [24] Large-scale Logistic Regression and Linear Support Vector Machines Using Spark
    Lin, Chieh-Yen
    Tsai, Cheng-Hao
    Lee, Ching-Pei
    Lin, Chih-Jen
    2014 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2014, : 519 - 528
  • [25] Improving Large-Scale k-Nearest Neighbor Text Categorization with Label Autoencoders
    Ribadas-Pena, Francisco J.
    Cao, Shuyuan
    Darriba Bilbao, Victor M.
    MATHEMATICS, 2022, 10 (16)
  • [26] A Novel Clustering Algorithm and Its Incremental Version for Large-Scale Text Collection
    Chen, Lei
    Liu, Ming
    Wu, Chong
    Xu, Ai
    INFORMATION TECHNOLOGY AND CONTROL, 2016, 45 (02) : 136 - 147
  • [27] Transient stability assessment in large-scale power systems using sparse logistic classifiers
    Lv, Jiaqing
    INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2022, 136
  • [28] Quantile regression for large-scale data via sparse exponential transform method
    Xu, Q. F.
    Cai, C.
    Jiang, C. X.
    Huang, X.
    STATISTICS, 2019, 53 (01) : 26 - 42
  • [30] To tune or not to tune, a case study of ridge logistic regression in small or sparse datasets
    Sinkovec, Hana
    Heinze, Georg
    Blagus, Rok
    Geroldinger, Angelika
    BMC MEDICAL RESEARCH METHODOLOGY, 2021, 21 (01)