Training parsers by inverse reinforcement learning

Cited by: 27
Authors
Neu, Gergely [1, 2]
Szepesvári, Csaba [2, 3]
Affiliations
[1] Budapest Univ Technol & Econ, Dept Comp Sci, H-1111 Budapest, Hungary
[2] Hungarian Acad Sci, Comp & Automat Res Inst, H-1111 Budapest, Hungary
[3] Univ Alberta, Dept Comp Sci, Edmonton, AB T6G 2E8, Canada
Keywords
Reinforcement learning; Inverse reinforcement learning; Parsing; PCFG; Discriminative parser training; Parser training; Parsing as behavior
DOI
10.1007/s10994-009-5110-1
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
One major idea in structured prediction is to assume that the predictor computes its output by finding the maximum of a score function. The training of such a predictor can then be cast as the problem of finding weights of the score function so that the output of the predictor on the inputs matches the corresponding structured labels on the training set. A similar problem is studied in inverse reinforcement learning (IRL) where one is given an environment and a set of trajectories and the problem is to find a reward function such that an agent acting optimally with respect to the reward function would follow trajectories that match those in the training set. In this paper we show how IRL algorithms can be applied to structured prediction, in particular to parser training. We present a number of recent incremental IRL algorithms in a unified framework and map them to parser training algorithms. This allows us to recover some existing parser training algorithms, as well as to obtain a new one. The resulting algorithms are compared in terms of their sensitivity to the choice of various parameters and generalization ability on the Penn Treebank WSJ corpus.
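To make the setup in the abstract concrete, the following is a minimal Python sketch of the score-function view of structured prediction, under illustrative assumptions: `phi` (a feature map over input/output pairs) and `candidates` (the admissible outputs for an input, e.g. the parse trees of a sentence) are hypothetical placeholders supplied by the caller, and the perceptron-style update is just one simple incremental rule of the kind the abstract alludes to, not the paper's specific algorithms. In the IRL reading, `phi(x, y)` plays the role of the features accumulated along a trajectory and `w` the reward parameters being fit so that the score-optimal output matches the demonstrated one.

```python
# Minimal sketch of structured prediction as score maximization.
# Assumed/hypothetical: `phi` and `candidates` are supplied by the caller;
# the update rule is a generic perceptron-style one, shown for illustration.
from collections import defaultdict

def score(w, feats):
    """Linear score w . phi(x, y) over sparse feature dicts."""
    return sum(w.get(f, 0.0) * v for f, v in feats.items())

def predict(w, x, candidates, phi):
    """The predictor: return the highest-scoring candidate structure."""
    return max(candidates(x), key=lambda y: score(w, phi(x, y)))

def train(data, candidates, phi, epochs=5):
    """Incremental training: when the current best guess disagrees with
    the gold structure, shift weight toward the gold structure's features
    and away from the guess's features."""
    w = defaultdict(float)
    for _ in range(epochs):
        for x, y_gold in data:
            y_hat = predict(w, x, candidates, phi)
            if y_hat != y_gold:
                for f, v in phi(x, y_gold).items():
                    w[f] += v
                for f, v in phi(x, y_hat).items():
                    w[f] -= v
    return dict(w)
```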
Pages: 303-337
Page count: 35
Related papers
50 records in total
  • [1] Training parsers by inverse reinforcement learning
    Neu, Gergely
    Szepesvári, Csaba
    [J]. Machine Learning, 2009, 77: 303-337
  • [2] Repeated Inverse Reinforcement Learning
    Amin, Kareem
    Jiang, Nan
    Singh, Satinder
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [3] Cooperative Inverse Reinforcement Learning
    Hadfield-Menell, Dylan
    Dragan, Anca
    Abbeel, Pieter
    Russell, Stuart
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [4] Misspecification in Inverse Reinforcement Learning
    Skalse, Joar
    Abate, Alessandro
    [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37, NO 12, 2023: 15136-15143
  • [5] Bayesian Inverse Reinforcement Learning
    Ramachandran, Deepak
    Amir, Eyal
    [J]. 20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007: 2586-2591
  • [6] Inverse Constrained Reinforcement Learning
    Malik, Shehryar
    Anwar, Usman
    Aghasi, Alireza
    Ahmed, Ali
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [7] Lifelong Inverse Reinforcement Learning
    Mendez, Jorge A.
    Shivkumar, Shashank
    Eaton, Eric
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [8] Inverse reinforcement learning with evaluation
    da Silva, Valdinei Freire
    Reali Costa, Anna Helena
    Lima, Pedro
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), VOLS 1-10, 2006: 4246+
  • [9] Identifiability in inverse reinforcement learning
    Cao, Haoyang
    Cohen, Samuel N.
    Szpruch, Lukasz
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [10] A survey of inverse reinforcement learning
    Adams, Stephen
    Cody, Tyler
    Beling, Peter A.
    [J]. ARTIFICIAL INTELLIGENCE REVIEW, 2022, 55(6): 4307-4346