Conversational Contextual Bandit: Algorithm and Application

Cited by: 52
Authors
Zhang, Xiaoying [1]
Xie, Hong [2]
Li, Hang [3]
Lui, John C. S. [1]
Affiliations
[1] Chinese Univ Hong Kong, CSE, Hong Kong, Peoples R China
[2] Chongqing Univ, Coll Comp Sci, Chongqing, Peoples R China
[3] Bytedance, AI Lab, Beijing, Peoples R China
DOI
10.1145/3366423.3380148
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Contextual bandit algorithms provide principled online learning solutions to balance the exploitation-exploration trade-off in various applications such as recommender systems. However, traditional contextual bandit algorithms often learn slowly because they require extensive exploration. This is a critical issue in applications like recommender systems, since users may need to provide feedback on many items they are not interested in. To accelerate learning, we generalize the contextual bandit to the conversational contextual bandit. The conversational contextual bandit leverages not only behavioral feedback on arms (e.g., articles in news recommendation), but also occasional conversational feedback on key-terms from the user. Here, a key-term can relate to a subset of arms, for example, a category of articles in news recommendation. We then design the Conversational UCB algorithm (ConUCB) to address two challenges in the conversational contextual bandit: (1) which key-terms to select for conversation, and (2) how to leverage conversational feedback to accelerate bandit learning. We theoretically prove that ConUCB can achieve a smaller regret upper bound than the traditional contextual bandit algorithm LinUCB, which implies faster learning. Experiments on synthetic data, as well as real datasets from Yelp and Toutiao, demonstrate the efficacy of the ConUCB algorithm.
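For readers unfamiliar with the LinUCB baseline named in the abstract, the following is a minimal sketch of a shared-parameter linear UCB learner. It is not the ConUCB algorithm, because the abstract does not specify how ConUCB selects key-terms or combines conversational feedback. The class name `LinUCB`, the hyperparameters `alpha` and `lam`, the `simulate` helper, and the toy arm features are all assumptions made for illustration.

```python
import math
import random


class LinUCB:
    """Shared-parameter linear UCB over d-dimensional arm features (illustrative)."""

    def __init__(self, dim, alpha=1.0, lam=1.0):
        self.dim = dim
        self.alpha = alpha  # exploration weight (assumed hyperparameter)
        # Inverse of A = lam * I + sum of x x^T, maintained incrementally.
        self.a_inv = [[(1.0 / lam if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * x

    def _matvec(self, x):
        return [sum(row[j] * x[j] for j in range(self.dim)) for row in self.a_inv]

    def theta(self):
        """Ridge-regression estimate A^{-1} b of the unknown parameter."""
        return self._matvec(self.b)

    def score(self, x):
        """Estimated reward plus an upper-confidence bonus."""
        mean = sum(t * xi for t, xi in zip(self.theta(), x))
        ax = self._matvec(x)
        width = math.sqrt(max(sum(xi * ai for xi, ai in zip(x, ax)), 0.0))
        return mean + self.alpha * width

    def select(self, arms):
        """Return the index of the arm with the highest UCB score."""
        return max(range(len(arms)), key=lambda i: self.score(arms[i]))

    def update(self, x, reward):
        """Sherman-Morrison rank-one update of A^{-1}, then update b."""
        ax = self._matvec(x)
        denom = 1.0 + sum(xi * ai for xi, ai in zip(x, ax))
        for i in range(self.dim):
            for j in range(self.dim):
                self.a_inv[i][j] -= ax[i] * ax[j] / denom
        for i in range(self.dim):
            self.b[i] += reward * x[i]


def simulate(rounds=500, seed=0):
    """Run the learner on a toy problem with made-up arms and parameters."""
    rng = random.Random(seed)
    true_theta = [0.8, -0.3]                      # hypothetical user preference
    arms = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.6]]   # hypothetical arm features
    agent = LinUCB(dim=2, alpha=0.5)
    pulls = [0] * len(arms)
    for _ in range(rounds):
        i = agent.select(arms)
        mean = sum(t * xi for t, xi in zip(true_theta, arms[i]))
        agent.update(arms[i], mean + rng.gauss(0.0, 0.1))  # noisy reward
        pulls[i] += 1
    return agent, pulls


if __name__ == "__main__":
    agent, pulls = simulate()
    print("pull counts per arm:", pulls)
    print("estimated theta:", agent.theta())
```

In this sketch the learner only sees arm-level rewards. The conversational setting described in the abstract would additionally feed in occasional feedback on key-terms, each tied to a subset of arms, which is the extra signal ConUCB is designed to exploit.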
Pages: 662-672
Page count: 11