Dynamic Pricing and Learning with Discounting

Cited by: 0
Authors
Feng, Zhichao [1]
Dawande, Milind [2]
Janakiraman, Ganesh [2]
Qi, Anyan [2]
Affiliations
[1] Hong Kong Polytech Univ, Fac Business, Dept Logist & Maritime Studies, Kowloon, Hong Kong, Peoples R China
[2] Univ Texas Dallas, Naveen Jindal Sch Management, Richardson, TX 75080 USA
Funding
National Natural Science Foundation of China
Keywords
dynamic pricing; learning; discounting; regret minimization; DEMAND; CAPACITY;
DOI
10.1287/opre.2023.2477
Chinese Library Classification number
C93 [Management Science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
In many practical settings, learning algorithms can take a substantial amount of time to converge, thereby raising the need to understand the role of discounting in learning. We illustrate the impact of discounting on the performance of learning algorithms by examining two classic and representative dynamic-pricing and learning problems studied in Broder and Rusmevichientong (BR) [Broder J, Rusmevichientong P (2012) Dynamic pricing under a general parametric choice model. Oper. Res. 60(4):965-980] and Keskin and Zeevi (KZ) [Keskin NB, Zeevi A (2014) Dynamic pricing with an unknown demand model: Asymptotically optimal semi-myopic policies. Oper. Res. 62(5):1142-1167]. In both settings, a seller sells a product with unlimited inventory over $T$ periods. The seller initially does not know the parameters of the general choice model in BR (respectively, the linear demand curve in KZ). Given a discount factor $p$, the retailer's objective is to determine a pricing policy to maximize the expected discounted revenue over $T$ periods. In both settings, we establish lower bounds on the regret under any policy and show limiting bounds of $\Omega(\sqrt{1/(1-p)})$ and $\Omega(\sqrt{T})$ when $T \to \infty$ and $p \to 1$, respectively. In the model of BR with discounting, we propose an asymptotically tight learning policy and show that the regret under our policy, as well as that under the MLE-CYCLE policy in BR, is $O(\sqrt{1/(1-p)})$ (respectively, $O(\sqrt{T})$) when $T \to \infty$ (respectively, $p \to 1$).
In the model of KZ with discounting, we present sufficient conditions for a learning policy to guarantee asymptotic optimality and show that the regret under any policy satisfying these conditions is $O(\log(1/(1-p))\sqrt{1/(1-p)})$ (respectively, $O(\log T \sqrt{T})$) when $T \to \infty$ (respectively, $p \to 1$). We show that three different policies (namely, the two variants of the greedy iterated least squares policy in KZ and a different policy that we propose) achieve this upper bound on the regret. We numerically examine the behavior of the regret under our policies as well as those in BR and KZ in the presence of discounting. We also analyze a setting in which the discount factor per period is a function of the number of decision periods in the planning horizon.
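As a rough intuition for why 1/(1 - p) appears where T would appear in the undiscounted bounds, consider the total discount weight over T periods, which equals (1 - p^T)/(1 - p). It approaches 1/(1 - p) as T grows with p fixed, and approaches T as p approaches 1 with T fixed. The sketch below only illustrates this arithmetic. It is not taken from the paper, it assumes period t carries weight p^(t-1), and the function names are hypothetical.

```python
def discounted_weight_sum(p: float, T: int) -> float:
    """Total discount weight sum_{t=1}^{T} p**(t-1) (assumed weighting convention)."""
    if p == 1.0:
        # No discounting: every period has weight 1.
        return float(T)
    # Closed form of the finite geometric series.
    return (1.0 - p**T) / (1.0 - p)


def effective_horizon(p: float) -> float:
    """Limit of the total discount weight as T grows, for a fixed p < 1."""
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    # Long horizon, fixed p: the weight sum is close to 1/(1 - p).
    for p in (0.9, 0.99):
        print(p, discounted_weight_sum(p, 10_000), effective_horizon(p))
    # Fixed horizon, p close to 1: the weight sum is close to T.
    print(discounted_weight_sum(0.999999, 100))
```

Under this reading, 1/(1 - p) plays the role of an effective horizon length, which matches the way the stated bounds swap T for 1/(1 - p) between the two limits.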
Pages: 481-492
Page count: 13