Online Convex Optimization With Time-Varying Constraints and Bandit Feedback

Cited by: 61
Authors
Cao, Xuanyu [1]
Liu, K. J. Ray [2]
Affiliations
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA
[2] Univ Maryland, Dept Elect & Comp Engn, College Pk, MD 20742 USA
Keywords
Bandit feedback; constrained optimization; online convex optimization (OCO); stochastic optimization; ALGORITHMS; REGRET;
DOI
10.1109/TAC.2018.2884653
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In this paper, the online convex optimization problem with time-varying constraints is studied from the perspective of an agent taking sequential actions. Both the objective function and the constraint functions are dynamic and unknown a priori to the agent. We first consider the scenario of gradient feedback, in which the values and gradients of the objective function and constraint functions at the chosen action are revealed after an action is submitted. We propose a computationally efficient online algorithm that involves only direct closed-form computations at each time instant. It is shown that the algorithm possesses sublinear regret with respect to the dynamic benchmark sequence and sublinear constraint violations, as long as the drift of the benchmark sequence is sublinear, or in other words, the underlying dynamic optimization problems do not vary too drastically. Furthermore, we investigate the scenario of bandit feedback, in which, after an action is chosen, only the values of the objective function and the constraint functions at several random points close to the action are announced to the agent. A bandit version of the online algorithm is proposed, and we also establish its sublinear expected regret and sublinear expected constraint violations under the assumption that the drift of the benchmark sequence is sublinear. Finally, two numerical examples, namely online quadratic programming and online logistic regression, are presented to corroborate the effectiveness of the proposed algorithms and to confirm the theoretical guarantees.
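The abstract does not state the update rules, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a generic primal-dual gradient step with a closed-form projection onto a Euclidean ball, and a standard two-point estimator that builds a gradient estimate from function values at two random points near the action. All function names, the ball-shaped action set, the step size, and the toy objective and constraint in `run_demo` are hypothetical choices made for illustration.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere (normalized Gaussians)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def two_point_gradient(func, x, delta, rng):
    """Estimate the gradient of func at x from two function values near x."""
    dim = len(x)
    u = random_unit_vector(dim, rng)
    f_plus = func([xi + delta * ui for xi, ui in zip(x, u)])
    f_minus = func([xi - delta * ui for xi, ui in zip(x, u)])
    scale = dim * (f_plus - f_minus) / (2.0 * delta)
    return [scale * ui for ui in u]


def project_to_ball(x, radius):
    """Closed-form Euclidean projection onto the ball of the given radius."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= radius:
        return list(x)
    return [radius * c / norm for c in x]


def primal_dual_step(x, lam, grad_f, grad_g, g_value, step, radius):
    """One generic primal-dual update for a single constraint g_t(x) <= 0."""
    direction = [gf + lam * gg for gf, gg in zip(grad_f, grad_g)]
    x_next = project_to_ball(
        [xi - step * di for xi, di in zip(x, direction)], radius
    )
    # The multiplier grows when the constraint is violated and stays >= 0.
    lam_next = max(0.0, lam + step * g_value)
    return x_next, lam_next


def run_demo(rounds=200, seed=0):
    """Toy run with a slowly drifting objective (hypothetical problem data)."""
    rng = random.Random(seed)
    x, lam = [0.0, 0.0], 0.0
    for t in range(rounds):
        center = [1.0 + 0.001 * t, 0.5]  # drifting minimizer of f_t

        def f_t(z, c=center):
            return sum((zi - ci) ** 2 for zi, ci in zip(z, c))

        def g_t(z):
            return z[0] - 0.8  # constraint z[0] <= 0.8

        grad_f = two_point_gradient(f_t, x, 0.05, rng)
        grad_g = two_point_gradient(g_t, x, 0.05, rng)
        # For brevity the dual update evaluates g_t at x itself.
        x, lam = primal_dual_step(x, lam, grad_f, grad_g, g_t(x), 0.05, 2.0)
    return x, lam
```

Calling `run_demo()` returns the final action and multiplier. Replacing the two `two_point_gradient` calls with exact gradients would give the corresponding gradient-feedback variant of this sketch.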
Pages: 2665-2680
Number of pages: 16
Related papers
50 in total
  • [21] STOCHASTIC CONVEX OPTIMIZATION WITH BANDIT FEEDBACK
    Agarwal, Alekh
    Foster, Dean P.
    Hsu, Daniel
    Kakade, Sham M.
    Rakhlin, Alexander
    [J]. SIAM JOURNAL ON OPTIMIZATION, 2013, 23 (01) : 213 - 240
  • [22] Optimization Filters for Stochastic Time-Varying Convex Optimization
    Simonetto, Andrea
    Massioni, Paolo
    [J]. 2023 EUROPEAN CONTROL CONFERENCE, ECC, 2023,
  • [23] Zhang neural network for online solution of time-varying convex quadratic program subject to time-varying linear-equality constraints
    Zhang, Yunong
    Li, Zhan
    [J]. PHYSICS LETTERS A, 2009, 373 (18-19) : 1639 - 1643
  • [24] Stabilization of linear time-varying systems with state and input constraints using convex optimization
    Feng Tan
    Mingzhe Hou
    Guangren Duan
    [J]. Journal of Systems Engineering and Electronics, 2016, 27 (03) : 649 - 655
  • [25] Stabilization of linear time-varying systems with state and input constraints using convex optimization
    Tan, Feng
    Hou, Mingzhe
    Duan, Guangren
    [J]. JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS, 2016, 27 (03) : 649 - 655
  • [26] Online bandit convex optimisation with stochastic constraints via two-point feedback
    Yu, Jichi
    Li, Jueyou
    Chen, Guo
    [J]. INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2023, 54 (10) : 2089 - 2105
  • [27] Online Trajectory Optimization Using Inexact Gradient Feedback for Time-Varying Environments
    Nutalapati, Mohan Krishna
    Bedi, Amrit Singh
    Rajawat, Ketan
    Coupechoux, Marceau
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2020, 68 : 4824 - 4838
  • [28] Event-triggered distributed online convex optimization with delayed bandit feedback
    Xiong, Menghui
    Zhang, Baoyong
    Yuan, Deming
    Zhang, Yijun
    Chen, Jun
    [J]. APPLIED MATHEMATICS AND COMPUTATION, 2023, 445
  • [29] Online distributed stochastic learning algorithm for convex optimization in time-varying directed networks
    Li, Jueyou
    Gu, Chuanye
    Wu, Zhiyou
    [J]. NEUROCOMPUTING, 2020, 416 : 85 - 94
  • [30] Online Convex Optimization of Programmable Quantum Computers to Simulate Time-Varying Quantum Channels
    Chittoor, Hari Hara Suthan
    Simeone, Osvaldo
    Banchi, Leonardo
    Pirandola, Stefano
    [J]. 2023 IEEE INFORMATION THEORY WORKSHOP, ITW, 2023, : 175 - 180