A Reward Shaping Approach for Reserve Price Optimization using Deep Reinforcement Learning

Authors
Afshar, Reza Refaei [1]
Rhuggenaath, Jason [1]
Zhang, Yingqian [1]
Kaymak, Uzay [1]
Affiliations
[1] Eindhoven University of Technology, Eindhoven, Netherlands
Keywords
Real Time Bidding; Reinforcement Learning; Reward Shaping; Deep Learning
DOI
10.1109/IJCNN52387.2021.9533817
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Real-time bidding is the process of selling and buying online advertisements in real-time auctions. These auctions are run by header bidding partners or ad exchanges to sell publishers' ad placements. Ad exchanges run second-price auctions, and a reserve price should be set for each ad placement or impression. This reserve price is normally determined by the bids of header bidding partners. However, the ad exchange may outbid higher reserve prices, and optimizing this value largely affects the revenue. In this paper, we propose a deep reinforcement learning approach for adjusting the reserve price of individual impressions using contextual information. Normally, ad exchanges do not return any information about the auction except the sold-unsold status. This binary feedback is not suitable for maximizing the revenue because it contains no explicit information about the revenue. To enrich the reward function, we develop a novel reward shaping approach that provides an informative reward signal for the reinforcement learning agent. In this approach, different intervals of reserve price get different weights, and the reward value of each interval is learned through a search procedure. Using a simulator, we test our method on a set of impressions. The results show that our proposed method outperforms the baselines in terms of revenue.
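The mechanics described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the interval edges, the weights, and the zero revenue for an unsold impression are all assumptions made for illustration; the abstract states only that reserve-price intervals receive different weights and that the only feedback is the sold-unsold status.

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def second_price_outcome(bids: Sequence[float], reserve: float) -> Tuple[bool, float]:
    """Return (sold, revenue) for a second-price auction with a reserve price.

    Simplifying assumption: an unsold impression earns zero revenue.
    """
    if not bids or max(bids) < reserve:
        return False, 0.0
    ranked = sorted(bids, reverse=True)
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    # The winner pays the larger of the second-highest bid and the reserve.
    return True, max(runner_up, reserve)


def shaped_reward(
    reserve: float,
    sold: bool,
    edges: Sequence[float],
    weights: Sequence[float],
    unsold_reward: float = 0.0,
) -> float:
    """Hypothetical interval-weighted reward built from binary feedback.

    `edges` are sorted interval boundaries; `weights` holds one value per
    interval, so it needs len(edges) + 1 entries.
    """
    if len(weights) != len(edges) + 1:
        raise ValueError("weights must have len(edges) + 1 entries")
    if not sold:
        return unsold_reward
    # bisect_right returns the index of the interval containing `reserve`.
    return weights[bisect_right(edges, reserve)]


# Illustrative values only, not taken from the paper.
EDGES = [1.0, 3.0]
WEIGHTS = [0.1, 0.5, 1.0]
sold, revenue = second_price_outcome([1.0, 2.5, 4.0], reserve=3.0)
reward = shaped_reward(3.0, sold, EDGES, WEIGHTS)
```

In this sketch the agent never observes `revenue`. It sees only `sold`, and the interval weight stands in for the missing revenue signal. In the paper the weights are learned through a search procedure, which is not reproduced here.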
Pages: 8