Hedging using reinforcement learning: Contextual k-armed bandit versus Q-learning

Cited by: 3
Authors
Cannelli, Loris [1 ,4 ]
Nuti, Giuseppe [2 ]
Sala, Marzio [3 ]
Szehr, Oleg [1 ]
Affiliations
[1] USI, Dalle Molle Inst Artificial Intelligence IDSIA, SUPSI, Lugano, Switzerland
[2] UBS Investment Bank, New York, NY USA
[3] UBS Investment Bank, Zurich, Switzerland
[4] Via La St 1, CH-6962 Lugano, Switzerland
Source
The Journal of Finance and Data Science
Keywords
Hedging; Reinforcement learning; Q-learning; Multi-armed bandits; Game; Go
DOI
10.1016/j.jfds.2023.100101
Chinese Library Classification
F8 [Finance]
Discipline Classification Code
0202
Abstract
The construction of replication strategies for contingent claims in the presence of risk and market friction is a key problem of financial engineering. In real markets, continuous replication, such as in the model of Black, Scholes and Merton (BSM), is not only unrealistic but also undesirable due to high transaction costs. A variety of methods have been proposed to balance effective replication against losses in the incomplete-market setting. With the rise of Artificial Intelligence (AI), AI-based hedgers have attracted considerable interest, with particular attention given to Recurrent Neural Network systems and variations of the Q-learning algorithm. From a practical point of view, sufficient samples for training such an AI can only be obtained from a simulator of the market environment. Yet if an agent is trained solely on simulated data, its run-time performance will primarily reflect the accuracy of the simulation, which leads to the classical problem of model choice and calibration. In this article, the hedging problem is viewed as an instance of a risk-averse contextual k-armed bandit problem, a choice motivated by the simplicity and sample-efficiency of the architecture, which allows for realistic online model updates from real-world data. We find that the k-armed bandit model naturally fits the Profit and Loss formulation of hedging, providing a more accurate and more sample-efficient approach than Q-learning and reducing to the Black-Scholes model in the absence of transaction costs and risks.
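The abstract describes casting a single hedging decision as a risk-averse contextual k-armed bandit: the context encodes the option state, the k arms are candidate hedge ratios, and the reward is a risk-adjusted Profit and Loss. The sketch below only illustrates that framing and is not the authors' implementation; the geometric-Brownian-motion toy market, the mean-variance reward, the context bucketing, and all parameter values are assumptions made for this example.

import numpy as np

rng = np.random.default_rng(0)

# k arms = k candidate hedge ratios (a discretised set of "delta" choices).
K_ARMS = 11
HEDGE_RATIOS = np.linspace(0.0, 1.0, K_ARMS)

# Context: coarse buckets of (moneyness, time to maturity).
N_MONEY, N_TAU = 10, 10
N_CONTEXTS = N_MONEY * N_TAU

def context_index(spot, strike, tau, tau_max=1.0):
    """Map the continuous option state to a discrete context id (toy bucketing)."""
    m = int(np.clip((spot / strike - 0.8) / 0.4 * N_MONEY, 0, N_MONEY - 1))
    t = int(np.clip(tau / tau_max * N_TAU, 0, N_TAU - 1))
    return m * N_TAU + t

value = np.zeros((N_CONTEXTS, K_ARMS))   # running estimate of expected reward
counts = np.zeros((N_CONTEXTS, K_ARMS))  # number of pulls per (context, arm)

def reward(pnl, hedge, cost_rate=0.002, risk_aversion=1.0):
    """Risk-adjusted one-period P&L: penalise transaction cost and P&L variance."""
    return pnl - cost_rate * abs(hedge) - risk_aversion * pnl ** 2

def choose_arm(ctx, eps=0.1):
    """Epsilon-greedy arm selection within the given context."""
    if rng.random() < eps:
        return int(rng.integers(K_ARMS))
    return int(np.argmax(value[ctx]))

def update(ctx, arm, r):
    """Incremental sample-average update; the same rule can run online on real fills."""
    counts[ctx, arm] += 1
    value[ctx, arm] += (r - value[ctx, arm]) / counts[ctx, arm]

# Toy training loop on a one-period geometric-Brownian-motion market (assumption).
S0, STRIKE, SIGMA, DT, TAU = 100.0, 100.0, 0.2, 1.0 / 52.0, 0.5
for episode in range(20_000):
    spot = S0 * np.exp(rng.normal(-0.5 * SIGMA ** 2 * TAU, SIGMA * np.sqrt(TAU)))
    ctx = context_index(spot, STRIKE, TAU)
    arm = choose_arm(ctx)
    hedge = HEDGE_RATIOS[arm]
    # One-period stock move; option P&L proxied by the change in intrinsic value.
    ds = spot * (np.exp(rng.normal(0.0, SIGMA * np.sqrt(DT))) - 1.0)
    option_pnl = max(spot + ds - STRIKE, 0.0) - max(spot - STRIKE, 0.0)
    update(ctx, arm, reward(hedge * ds - option_pnl, hedge))

In this sketch the risk_aversion term is what makes the bandit risk-averse: it steers the greedy arm in each context toward the variance-minimising hedge ratio rather than the one with the highest expected P&L, while cost_rate discourages large hedge positions.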
Pages: 22