Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)

Cited by: 0
Authors
Huang Bojun [1]
Affiliations
[1] Rakuten Grp Inc, Rakuten Inst Technol, Tokyo, Japan
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper discusses a new approach to the fundamental problem of learning optimal Q-functions. In this approach, optimal Q-functions are formulated as saddle points of a nonlinear Lagrangian function derived from the classic Bellman optimality equation. The paper shows that the Lagrangian enjoys strong duality despite its nonlinearity, which paves the way for a general Lagrangian method for Q-function learning. As a demonstration, the paper develops an imitation learning algorithm based on the duality theory and applies the algorithm to a state-of-the-art machine translation benchmark. The paper then demonstrates a symmetry-breaking phenomenon regarding the optimality of the Lagrangian saddle points, which justifies a largely overlooked direction in developing the Lagrangian method.
Pages: 31