Stability and optimization error of stochastic gradient descent for pairwise learning

Cited by: 9
Authors
Shen, Wei [1 ]
Yang, Zhenhuan [2 ]
Ying, Yiming [2 ]
Yuan, Xiaoming [3 ]
Affiliations
[1] Hong Kong Baptist Univ, Dept Math, Kowloon Tong, Kowloon, Hong Kong, Peoples R China
[2] SUNY Albany, Dept Math & Stat, Albany, NY 12222 USA
[3] Univ Hong Kong, Dept Math, Hong Kong, Peoples R China
Funding
US National Science Foundation;
Keywords
Stability; generalization; optimization error; stochastic gradient descent; pairwise learning; minimax statistical error; RANKING; BOUNDS; ALGORITHMS; AREA;
DOI
10.1142/S0219530519400062
Chinese Library Classification
O29 [Applied Mathematics];
Discipline Classification Code
070104;
Abstract
In this paper, we study the stability of stochastic gradient descent (SGD) algorithms and its trade-off with optimization error in the pairwise learning setting. Pairwise learning refers to learning tasks whose loss function depends on pairs of instances; notable examples include bipartite ranking, metric learning, area under the ROC curve (AUC) maximization, and the minimum error entropy (MEE) principle. Our contribution is twofold. First, we establish stability results for SGD for pairwise learning in the convex, strongly convex, and non-convex settings, from which generalization errors can be naturally derived. Second, we establish the trade-off between the stability and the optimization error of SGD algorithms for pairwise learning. This is achieved by lower-bounding the sum of stability and optimization error by the minimax statistical error over a prescribed class of pairwise loss functions. From this fundamental trade-off, we obtain lower bounds for the optimization error of SGD algorithms and for the excess expected risk over a class of pairwise losses. In addition, we illustrate our stability results with specific examples of AUC maximization, metric learning, and MEE.
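A minimal illustrative sketch of the pairwise SGD setting described in the abstract is given below, using an AUC-style pairwise squared loss. The pairing rule (each new example is paired with a uniformly sampled earlier one), the loss, the step sizes eta_t = eta0/sqrt(t), and all function names are assumptions for illustration only, not the exact scheme analyzed in the paper.

import numpy as np

def pairwise_sq_loss_grad(w, x, y, x_prev, y_prev):
    # Gradient of an AUC-style pairwise squared loss (1 - w.(x_pos - x_neg))^2,
    # which is nonzero only for pairs with opposite labels.
    if y == y_prev:
        return np.zeros_like(w)
    diff = (x - x_prev) if y > y_prev else (x_prev - x)   # positive minus negative
    return -2.0 * (1.0 - w @ diff) * diff

def sgd_pairwise(X, y, eta0=0.1, seed=0):
    # Pairwise SGD sketch: at step t, the new example (X[t], y[t]) is paired
    # with a uniformly sampled earlier example, and a gradient step is taken
    # with decaying step size eta0 / sqrt(t).
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, n):
        j = rng.integers(t)   # index of a previously seen example
        g = pairwise_sq_loss_grad(w, X[t], y[t], X[j], y[j])
        w -= (eta0 / np.sqrt(t)) * g
    return w

# Toy usage on synthetic binary-labeled data.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
y = (X @ rng.normal(size=5) > 0).astype(float)
w_hat = sgd_pairwise(X, y)

The decaying step size is the knob behind the trade-off the abstract refers to: smaller steps typically improve stability while slowing the decrease of optimization error.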
Pages: 887-927
Page count: 41