A stochastic gradient descent algorithm for structural risk minimisation

Cited: 0
Authors
Ratsaby, J [1 ]
Affiliation
[1] UCL, London WC1E 6BT, England
Source
Keywords
DOI
Not available
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Structural risk minimisation (SRM) is a general complexity-regularisation method which automatically selects the model complexity that approximately minimises the misclassification error probability of the empirical risk minimiser. It does so by adding a complexity penalty term ε(m, k) to the empirical risk of the candidate hypotheses and then, for any fixed sample size m, minimising the sum with respect to the model-complexity variable k. When learning multi-category classification there are M subsamples of sizes m_i, corresponding to the M pattern classes with a priori probabilities p_i, 1 ≤ i ≤ M. Using the usual representation of a multi-category classifier as M individual Boolean classifiers, the penalty becomes Σ_{i=1}^{M} p_i ε(m_i, k_i). If the m_i are given, then standard SRM applies directly by minimising the penalised empirical risk with respect to k_i, i = 1, ..., M. However, in situations where the total sample size Σ_{i=1}^{M} m_i needs to be minimal, one must also minimise the penalised empirical risk with respect to the variables m_i, i = 1, ..., M. The obvious problem is that the empirical risk can only be defined after the subsamples (and hence their sizes) are known. Using an on-line stochastic gradient descent approach, this paper overcomes this difficulty and introduces a sample-querying algorithm which extends the standard SRM principle. It minimises the penalised empirical risk not only with respect to the k_i, as standard SRM does, but also with respect to the m_i, i = 1, ..., M. The challenge is to define a stochastic empirical criterion which, when minimised, yields a sequence of subsample-size vectors that asymptotically achieves the Bayes-optimal error convergence rate.
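The abstract's two minimisations can be sketched in a few lines. This is not the paper's algorithm: the penalty ε(m, k) below is a generic illustrative VC-style term (anything growing in k and shrinking in m plays the same role), and the sample-allocation loop is a simple greedy stand-in for the paper's on-line stochastic gradient descent over the subsample sizes m_i.

```python
import math

def penalty(m, k):
    """Illustrative complexity penalty epsilon(m, k).
    The paper's exact form is not reproduced here; this term merely
    grows with model complexity k and shrinks with sample size m."""
    return math.sqrt((k + math.log(m + 1)) / (m + 1))

def srm_select_k(emp_risk_by_k, m):
    """Standard SRM for a fixed sample size m: choose the complexity k
    minimising empirical risk + penalty(m, k)."""
    return min(emp_risk_by_k, key=lambda k: emp_risk_by_k[k] + penalty(m, k))

def allocate_samples(p, k, total):
    """Greedy stand-in for the paper's extension: also choose the
    subsample sizes m_i.  At each step, query one more sample from the
    class i whose weighted penalty term p_i * epsilon(m_i, k_i) would
    decrease the most."""
    M = len(p)
    m = [0] * M
    for _ in range(total):
        gain = lambda i: p[i] * (penalty(m[i], k[i]) - penalty(m[i] + 1, k[i]))
        best = max(range(M), key=gain)
        m[best] += 1
    return m
```

With equiprobable classes of equal complexity the greedy rule splits the sample budget evenly; skewed priors p_i shift queries toward the heavier class, which is the qualitative behaviour the paper's criterion is designed to produce in a principled, asymptotically optimal way.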
Pages: 205-220
Page count: 16