Online Second Price Auction with Semi-Bandit Feedback under the Non-Stationary Setting

Authors
Zhao, Haoyu [1]
Chen, Wei [2]
Affiliations
[1] Tsinghua Univ, IIIS, Beijing, Peoples R China
[2] Microsoft Res, Beijing, Peoples R China
Keywords
REGRET;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we study the non-stationary online second price auction problem. We assume that the seller sells the same type of item over $T$ rounds by second price auction, and that she can set the reserve price in each round. In each round, the bidders draw their private values from a joint distribution unknown to the seller. The seller then announces the reserve price for that round. Next, bidders whose private values are higher than the announced reserve price report their values to the seller as their bids. The bidder with the highest bid above the reserve price wins the item and pays the seller the second-highest bid or the reserve price, whichever is larger. The seller wants to maximize her total revenue over the time horizon $T$ while learning the distribution of private values over time. The problem is more challenging than the standard online learning scenario because the private value distribution is non-stationary, meaning that the distribution of bidders' private values may change over time, and we need to use the non-stationary regret to measure the performance of our algorithm. To our knowledge, this paper is the first to study the repeated auction in the non-stationary setting theoretically. Our algorithm achieves the non-stationary regret upper bound $\tilde{O}(\min\{\sqrt{ST}, \bar{V}^{1/3} T^{2/3}\})$, where $S$ is the number of switches in the distribution and $\bar{V}$ is the sum of total variation; the algorithm does not need to know $S$ or $\bar{V}$ in advance. We also prove regret lower bounds of $\Omega(\sqrt{ST})$ in the switching case and $\Omega(\bar{V}^{1/3} T^{2/3})$ in the dynamic case, showing that our algorithm has nearly optimal non-stationary regret.
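The single-round payment rule described in the abstract can be illustrated with a short sketch. This is not the paper's learning algorithm; it only computes the seller's revenue in one round for a given reserve price. The function name `second_price_revenue` and the example values are hypothetical, chosen for illustration.

```python
from typing import Sequence


def second_price_revenue(values: Sequence[float], reserve: float) -> float:
    """Seller revenue for one round of a second price auction with a reserve.

    Hypothetical helper: `values` are the bidders' private values for the
    round, `reserve` is the reserve price the seller announced.
    """
    # Only bidders whose private value exceeds the reserve submit a bid.
    bids = sorted((v for v in values if v > reserve), reverse=True)
    if not bids:
        # No bid clears the reserve, so the item goes unsold.
        return 0.0
    # The winner pays the larger of the second-highest bid and the reserve.
    # With a single bid there is no second-highest bid, so the reserve applies.
    second_highest = bids[1] if len(bids) > 1 else reserve
    return max(second_highest, reserve)


if __name__ == "__main__":
    print(second_price_revenue([0.9, 0.6, 0.3], reserve=0.5))
    print(second_price_revenue([0.9, 0.3], reserve=0.5))
    print(second_price_revenue([0.4, 0.3], reserve=0.5))
```

Under this rule the seller only ever sees the bids above the reserve she chose, which is the semi-bandit feedback named in the title: raising the reserve can raise the payment, but it also hides more of the value distribution and risks leaving the item unsold.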
Pages: 6893-6900
Page count: 8