Learning Best Response Strategies for Agents in Ad Exchanges

Cited by: 0
Authors:
Gerakaris, Stavros [1]
Ramamoorthy, Subramanian [1]
Affiliations:
[1] Univ Edinburgh, Sch Informat, Edinburgh, Scotland
Keywords: Ad exchanges; Stochastic game; Censored observations; Harsanyi-Bellman Ad Hoc Coordination; Kaplan-Meier estimator
DOI: 10.1007/978-3-030-14174-5_6
Chinese Library Classification: TP18 [Theory of artificial intelligence]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract:
Ad exchanges are widely used in platforms for online display advertising. Autonomous agents operating in these exchanges must learn policies for interacting profitably with a diverse, continually changing, but unknown market. We consider this problem from the perspective of a publisher strategically interacting with an advertiser through a posted price mechanism. The learning problem for this agent is made difficult by the fact that information is censored, i.e., the publisher knows whether an impression is sold but receives no other quantitative information. We address this problem using the Harsanyi-Bellman Ad Hoc Coordination (HBA) algorithm [1,3], which conceptualises this interaction as a Stochastic Bayesian Game and arrives at optimal actions by best responding with respect to probabilistic beliefs maintained over a candidate set of opponent behaviour profiles. We adapt and apply HBA to the censored-information setting of ad exchanges. In addition, to address the case of stochastic opponents, we devise a strategy based on a Kaplan-Meier estimator for opponent modelling. We evaluate the proposed method in simulations, showing that HBA-KM achieves a substantially better competitive ratio and lower variance of return than baselines, including a Q-learning agent and a UCB-based online learning agent, and performs comparably to the offline optimal algorithm.
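
The abstract names a Kaplan-Meier estimator for modelling a stochastic opponent from censored sale/no-sale feedback but gives no implementation details. Below is a minimal Python sketch of a generic product-limit (Kaplan-Meier) estimate of a bid survival curve from right-censored observations; the function name `kaplan_meier`, the `(value, event)` encoding, and the toy data are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

def kaplan_meier(observations):
    """Product-limit (Kaplan-Meier) estimate of S(x) = P(V > x) from
    right-censored data.

    observations: iterable of (x, event) pairs; event=True means the value V
    was observed exactly at x, event=False means we only know V >= x
    (e.g. the impression sold at posted price x, so the bid was at least x).
    Returns [(x, S(x)), ...] at the distinct event values, in ascending order.
    """
    events = Counter(x for x, e in observations if e)   # exact observations per value
    totals = Counter(x for x, _ in observations)        # all observations per value
    n_at_risk = sum(totals.values())                    # everyone starts at risk
    survival, curve = 1.0, []
    for x in sorted(totals):
        d = events[x]
        if d > 0:                                       # product-limit update at event values
            survival *= 1.0 - d / n_at_risk
            curve.append((x, survival))
        n_at_risk -= totals[x]                          # events and censorings leave the risk set
    return curve

# Toy usage with hypothetical posted-price outcomes (values in currency units):
# exact bid observations at 0.30 and 0.90, plus sales that only reveal "bid >= price".
if __name__ == "__main__":
    data = [(0.30, True), (0.50, False), (0.50, True), (0.80, False), (0.90, True)]
    for price, surv in kaplan_meier(data):
        print(f"P(bid > {price:.2f}) ~= {surv:.3f}")
```

In the paper's setting, a survival estimate of this kind would stand in for the unknown bid distribution when the publisher chooses posted prices; the data above is purely illustrative.
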
Pages: 77-93 (17 pages)