Reinforcement Learning with Sequential Information Clustering in Real-Time Bidding

Cited by: 12
Authors
Lu, Junwei [1, 3]
Yang, Chaoqi [1, 3]
Gao, Xiaofeng [1, 3]
Wang, Liubin [2]
Li, Changcheng [2]
Chen, Guihai [1, 3]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Tencent, Shenzhen, Peoples R China
[3] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai Key Lab Scalable Comp & Syst, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Display Advertising; Reinforcement Learning; Clustering;
DOI
10.1145/3357384.3358027
Chinese Library Classification (CLC) number
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
Display advertising is a billion-dollar business and the primary source of income for many companies. In this setting, real-time bidding optimization is one of the most important problems: an intelligent policy determines the bid for each ad impression so that global key performance indicators are optimized. Because the bidding environment is highly dynamic, many recent works use reinforcement learning algorithms to train bidding agents. However, because any particular state typically occurs with low probability and the state representations in current work lack sequential information, the convergence speed and performance of deep reinforcement learning algorithms are disappointing. To tackle these two challenges in the real-time bidding scenario, we propose ClusterA3C, a novel Asynchronous Advantage Actor-Critic (A3C) variant integrated with a sequential information extraction scheme and a clustering-based state aggregation scheme. We conduct extensive experiments to validate the proposed scheme on a real-world commercial dataset. Experimental results show that the proposed scheme outperforms state-of-the-art methods in terms of either performance or convergence speed.
Pages: 1633 - 1641
Page count: 9
Related papers
50 in total
  • [1] Real-Time Bidding by Reinforcement Learning in Display Advertising
    Cai, Han
    Ren, Kan
    Zhang, Weinan
    Malialis, Kleanthis
    Wang, Jun
    Yu, Yong
    Guo, Defeng
    [J]. WSDM'17: PROCEEDINGS OF THE TENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, 2017: 661 - 670
  • [2] Deep Reinforcement Learning for Sponsored Search Real-time Bidding
    Zhao, Jun
    Qiu, Guang
    Guan, Ziyu
    Zhao, Wei
    He, Xiaofei
    [J]. KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2018: 1021 - 1030
  • [3] Reinforcement Learning Method for Ad Networks Ordering in Real-Time Bidding
    Afshar, Reza Refaei
    Zhang, Yingqian
    Firat, Murat
    Kaymak, Uzay
    [J]. AGENTS AND ARTIFICIAL INTELLIGENCE, ICAART 2019, 2019, 11978 : 16 - 36
  • [4] Real-Time Bidding with Side Information
    Flajolet, Arthur
    Jaillet, Patrick
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [5] Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising
    Jin, Junqi
    Song, Chengru
    Li, Han
    Gai, Kun
    Wang, Jun
    Zhang, Weinan
    [J]. CIKM'18: PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2018: 2193 - 2201
  • [6] An extensible approach for real-time bidding with model-free reinforcement learning
    Cheng, Yin
    Zou, Luobao
    Zhuang, Zhiwei
    Liu, Jingwei
    Xu, Bin
    Zhang, Weidong
    [J]. NEUROCOMPUTING, 2019, 360 : 97 - 106
  • [7] An Intelligent Bidding Strategy Based on Model-Free Reinforcement Learning for Real-Time Bidding in Display Advertising
    Liu, Mengjuan
    Li, Jiaxing
    Yue, Wei
    Qiu, Lizhou
    Liu, Jinyu
    Qin, Zhiguang
    [J]. 2019 SEVENTH INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD), 2019: 240 - 245
  • [8] Real-Time Bidding with Soft Actor-Critic Reinforcement Learning in Display Advertising
    Yakovleva, Dania
    Popov, Artem
    Filchenkov, Andrey
    [J]. PROCEEDINGS OF THE 2019 25TH CONFERENCE OF OPEN INNOVATIONS ASSOCIATION (FRUCT), 2019: 373 - 382
  • [9] Deep Reinforcement Learning Based Real-Time Renewable Energy Bidding with Battery Control
    Jeong, Jaeik
    Kim, Seung Wan
    Kim, Hongseok
    [J]. IEEE Transactions on Energy Markets, Policy and Regulation, 2023, 1 (02): 85 - 96
  • [10] Real-Time Reinforcement Learning
    Ramstedt, Simon
    Pal, Christopher
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32