Reinforcement learning for Multi-Flight Dynamic Pricing

Cited by: 0

Authors:

Zhu, Xinghui [1]

Jian, Lulu [1]

Chen, Xin [2]

Zhao, Qian [1]

Affiliations:

[1] Nanjing Univ Aeronaut & Astronaut, Coll Civil Aviat, Nanjing 211106, Peoples R China

[2] Nanjing Univ Finance & Econ, Sch Management Sci & Engn, 3 Wenyuan Rd, Nanjing, Jiangsu, Peoples R China

Source:

COMPUTERS & INDUSTRIAL ENGINEERING | 2024 / Vol. 193

Keywords:

Dynamic pricing; Reinforcement learning; Multi-flight pricing; Multi-nominal logit; REVENUE MANAGEMENT; PERISHABLE PRODUCTS; STOCHASTIC DEMAND; YIELD-MANAGEMENT; FARE CLASSES; MODEL; INVENTORY; ALGORITHM;

DOI:

10.1016/j.cie.2024.110302

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Dynamic pricing is essential for airline revenue management, requiring quick adaptation to fluctuating market environments and complex customer behaviors. This study addresses the Multi-Flight Dynamic Pricing (MFDP) problem, which presents unique challenges due to interdependent demand between multiple flights and high dimensionality. Traditional studies often assume that the demand function modeling customer behavior is either known in advance or follows a predefined structure, failing to capture the dynamic nature of pricing decisions. To fill this gap, we develop deep reinforcement learning (DRL) algorithms: Deep Q-Network (DQN), Advantage Actor-Critic (A2C), Proximal Policy Optimization (PPO), and Trust Region Policy Optimization (TRPO). By formulating the MFDP problem as a Markov Decision Process (MDP), we design an innovative utility function for the Multinomial Logit (MNL) model that captures realistic features of the airline market, such as competition from high-speed rail, the effect of reference fares, and travel time. We compare the performance of our DRL algorithms with traditional algorithms, including Dynamic Programming (DP), Price Pooling (PP), Inventory Pooling (IP), and Inventory and Price Pooling (IPP). Our experiments demonstrate that DRL algorithms alleviate the curse of dimensionality faced by traditional algorithms, expedite the learning process, and deliver satisfactory performance without relying on predefined demand functions. Among these algorithms, TRPO shows superior performance, achieving 99% of the theoretical optimal revenue, proving its adaptability and stability in dynamic pricing applications. We also highlight the importance of considering the null price in the action space of MFDP problems. The larger the market scale, the more pronounced the effect of the null price in accelerating RL algorithm convergence, leading to more efficient computational resource utilization.
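To make the role of the MNL model and the null price concrete, here is a minimal sketch of how purchase probabilities across several flights could be computed. It is not the paper's utility function: the linear-in-price utility, the parameter names (`base_utilities`, `price_sensitivity`, `outside_utility`), and the use of `None` to represent the null price are all assumptions made for illustration. The features named in the abstract (high-speed rail competition, reference fares, travel time) are not modeled here beyond a single outside option.

```python
import math


def mnl_choice_probabilities(prices, base_utilities,
                             price_sensitivity=0.01, outside_utility=0.0):
    """Return (per-flight purchase probabilities, no-purchase probability).

    Assumed utility for flight j: base_utilities[j] - price_sensitivity * prices[j].
    A price of None stands for the null price: the flight is closed for sale
    and receives zero purchase probability.
    """
    weights = []
    for price, base in zip(prices, base_utilities):
        if price is None:
            weights.append(0.0)  # null price: flight not offered
        else:
            weights.append(math.exp(base - price_sensitivity * price))
    outside_weight = math.exp(outside_utility)  # outside option (no purchase)
    total = outside_weight + sum(weights)
    return [w / total for w in weights], outside_weight / total


def expected_revenue(prices, base_utilities, **kwargs):
    """Expected revenue from one arriving customer under the sketch above."""
    probs, _ = mnl_choice_probabilities(prices, base_utilities, **kwargs)
    return sum(p * price for p, price in zip(probs, prices) if price is not None)
```

Because all flights share one denominator, changing or closing one flight's price shifts demand to the others, which is the interdependence the abstract refers to. For example, `mnl_choice_probabilities([500, None], [5.0, 6.0])` assigns the second flight zero probability and raises the first flight's share.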


Pages: 16

50 items in total

[1] Dynamic pricing and reinforcement learning
Carvalho, AX
Puterman, ML
[J]. PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2003, VOLS 1-4, 2003, : 2916 - 2921
[2] Dynamic Pricing by Multiagent Reinforcement Learning
Han, Wei
Liu, Lingbo
Zheng, Huaili
[J]. PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON ELECTRONIC COMMERCE AND SECURITY, 2008, : 226 - 229
[3] Reinforcement Learning for Fair Dynamic Pricing
Maestre, Roberto
Duque, Juan
Rubio, Alberto
Arevalo, Juan
[J]. INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 1, 2019, 868 : 120 - 135
[4] Learning Dynamic Pricing Rules for Flight Tickets
Cao, Jian
Liu, Zeling
Wu, Yao
[J]. KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT (KSEM 2020), PT I, 2020, 12274 : 498 - 505
[5] Application of Reinforcement Learning in Dynamic Pricing Algorithms
Wang Jintian
Zhou Lei
[J]. 2009 IEEE INTERNATIONAL CONFERENCE ON AUTOMATION AND LOGISTICS (ICAL 2009), VOLS 1-3, 2009, : 419 - 423
[6] Dynamic Pricing for Smart Grid with Reinforcement Learning
Kim, Byung-Gook
Zhang, Yu
van der Schaar, Mihaela
Lee, Jang-Won
[J]. 2014 IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS (INFOCOM WKSHPS), 2014, : 640 - 645
[7] Approach for Dynamic Flight Pricing Based on Strategy Learning
Lu Min
Zhang Yaoyuan
Lu Chun
[J]. JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (04) : 1022 - 1028
[8] Reinforcement Learning for Adaptive Caching With Dynamic Storage Pricing
Sadeghi, Alireza
Sheikholeslami, Fatemeh
Marques, Antonio G.
Giannakis, Georgios B.
[J]. IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2019, 37 (10) : 2267 - 2281
[9] Dynamic Pricing and Energy Consumption Scheduling With Reinforcement Learning
Kim, Byung-Gook
Zhang, Yu
van der Schaar, Mihaela
Lee, Jang-Won
[J]. IEEE TRANSACTIONS ON SMART GRID, 2016, 7 (05) : 2187 - 2198
[10] Dynamic pricing under competition using reinforcement learning
Alexander Kastius
Rainer Schlosser
[J]. Journal of Revenue and Pricing Management, 2022, 21 : 50 - 63
