A stochastic gradient descent algorithm for structural risk minimisation

Cited by: 0
Author: Ratsaby, J [1]
Affiliation: [1] UCL, London WC1E 6BT, England
Source:
Keywords:
DOI: Not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Structural risk minimisation (SRM) is a general complexity regularization method which automatically selects the model complexity that approximately minimises the misclassification error probability of the empirical risk minimiser. It does so by adding a complexity penalty term $\epsilon(m, k)$ to the empirical risk of the candidate hypotheses and then, for any fixed sample size $m$, minimising the sum with respect to the model-complexity variable $k$. When learning multicategory classification there are $M$ subsamples of sizes $m_i$, corresponding to the $M$ pattern classes with a priori probabilities $p_i$, $1 \le i \le M$. Using the usual representation of a multicategory classifier as $M$ individual Boolean classifiers, the penalty becomes $\sum_{i=1}^{M} p_i\,\epsilon(m_i, k_i)$. If the $m_i$ are given, then standard SRM applies directly by minimising the penalised empirical risk with respect to $k_i$, $i = 1, \ldots, M$. However, in situations where the total sample size $\sum_{i=1}^{M} m_i$ needs to be minimal, one must also minimise the penalised empirical risk with respect to the variables $m_i$, $i = 1, \ldots, M$. The obvious problem is that the empirical risk can only be defined after the subsamples (and hence their sizes) are known. Using an on-line stochastic gradient descent approach, this paper overcomes this difficulty and introduces a sample-querying algorithm which extends the standard SRM principle: it minimises the penalised empirical risk not only with respect to the $k_i$, as standard SRM does, but also with respect to the $m_i$, $i = 1, \ldots, M$. The challenge lies in defining a stochastic empirical criterion which, when minimised, yields a sequence of subsample-size vectors that asymptotically achieves the Bayes-optimal error convergence rate.
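To make the penalised criterion concrete, here is a minimal, self-contained Python sketch of standard SRM model selection plus a greedy per-class sample-allocation loop over the $m_i$. The penalty form, the hypothesis representation, and the greedy allocation rule are all illustrative assumptions; this is not the paper's actual stochastic gradient descent querying algorithm.

```python
# Illustrative sketch only: a VC-style penalty epsilon(m, k), standard SRM
# model selection, and a greedy rule that gives the next labelled query to
# the class whose term p_i * epsilon(m_i, k_i) would shrink the most.
# The penalty form and allocation rule are assumptions, not the paper's
# exact algorithm.
import math

def penalty(m: int, k: int, delta: float = 0.05) -> float:
    """Generic complexity penalty epsilon(m, k): decreasing in m, increasing in k."""
    if m <= 0:
        return float("inf")
    return math.sqrt((k * math.log(2.0 * math.e * m) + math.log(4.0 / delta)) / m)

def empirical_risk(h, sample) -> float:
    """Misclassification rate of hypothesis h on a labelled subsample."""
    if not sample:
        return 1.0
    return sum(h(x) != y for x, y in sample) / len(sample)

def srm_select(hypotheses_by_k, sample):
    """Standard SRM: minimise empirical risk plus penalty over complexity k."""
    m = len(sample)
    def penalised(k):
        return min(empirical_risk(h, sample) for h in hypotheses_by_k[k]) + penalty(m, k)
    k_star = min(hypotheses_by_k, key=penalised)
    h_star = min(hypotheses_by_k[k_star], key=lambda h: empirical_risk(h, sample))
    return k_star, h_star

def next_query_class(priors, m, k):
    """Greedy step: query the class whose next example most reduces
    sum_i p_i * epsilon(m_i, k_i)."""
    def gain(i):
        return priors[i] * (penalty(m[i], k[i]) - penalty(m[i] + 1, k[i]))
    return max(range(len(priors)), key=gain)

# Toy allocation run: classes with larger priors carry a larger share of the
# penalty and therefore receive more of the labelling budget.
priors = [0.5, 0.3, 0.2]
m = [1, 1, 1]
k = [3, 3, 3]
for _ in range(300):
    m[next_query_class(priors, m, k)] += 1
print(m)  # higher-prior classes end up with more queries
```

The greedy rule is only one plausible way to trade off the per-class penalty terms under a total labelling budget; the paper's contribution is a stochastic, on-line empirical criterion with convergence-rate guarantees, which this sketch does not attempt to capture.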
Pages: 205-220 (16 pages)
Related papers (50 in total)
  • [41] Implementation of Stochastic Parallel Gradient Descent Algorithm for Coherent Beam Combining
    Linslal, C. L.
    Sooraj, M. S.
    Padmanabhan, A.
    Venkitesh, D.
    Srinivasan, B.
    HIGH-POWER LASERS AND APPLICATIONS IX, 2018, 10811
  • [42] Using the Stochastic Gradient Descent Optimization Algorithm on Estimating of Reactivity Ratios
    Fazakas-Anca, Iosif Sorin
    Modrea, Arina
    Vlase, Sorin
    MATERIALS, 2021, 14 (16)
  • [43] Stochastic gradient descent algorithm preserving differential privacy in MapReduce framework
    Yu Y.
    Fu Y.
    Wu X.
    Tongxin Xuebao/Journal on Communications, 2018, 39 (01): 70-77
  • [44] A Stochastic Parallel Gradient Descent Algorithm for Person Re-identification
    Cheng, Keyang
    Tao, Fei
    2018 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (IEEE VCIP), 2018
  • [45] Improved method of stochastic parallel gradient descent algorithm with global coupling
    Jiang, Pengzhi
    Liang, Yonghui
    Xu, Jieping
    Mao, Hongjun
    Guangxue Xuebao/Acta Optica Sinica, 2014, 34
  • [46] The Minimization of Empirical Risk Through Stochastic Gradient Descent with Momentum Algorithms
    Chaudhuri, Arindam
    ARTIFICIAL INTELLIGENCE METHODS IN INTELLIGENT ALGORITHMS, 2019, 985: 168-181
  • [47] Field detection of small pests through stochastic gradient descent with genetic algorithm
    Ye, Yin
    Huang, Qiangqiang
    Rong, Yi
    Yu, Xiaohan
    Liang, Weiji
    Chen, Yaxiong
    Xiong, Shengwu
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2023, 206
  • [48] On Projected Stochastic Gradient Descent Algorithm with Weighted Averaging for Least Squares Regression
    Cohen, Kobi
    Nedic, Angelia
    Srikant, R.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2017, 62 (11): 5974-5981
  • [49] Deep learning for sea cucumber detection using stochastic gradient descent algorithm
    Zhang, Huaqiang
    Yu, Fusheng
    Sun, Jincheng
    Shen, Xiaoqin
    Li, Kun
    EUROPEAN JOURNAL OF REMOTE SENSING, 2020, 53: 53-62
  • [50] An asynchronous distributed training algorithm based on Gossip communication and Stochastic Gradient Descent
    Tu, Jun
    Zhou, Jia
    Ren, Donglin
    COMPUTER COMMUNICATIONS, 2022, 195: 416-423