Probabilistic Offline Policy Ranking with Approximate Bayesian Computation

Cited by: 0
Authors
Da, Longchao [1]
Jenkins, Porter [2]
Schwantes, Trevor [2]
Dotson, Jeffrey [2]
Wei, Hua [1]
Affiliations
[1] Arizona State Univ, Tempe, AZ 85287 USA
[2] Brigham Young Univ, Provo, UT 84602 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In practice, it is essential to compare and rank candidate policies offline before real-world deployment for safety and reliability. Prior work seeks to solve this offline policy ranking (OPR) problem through value-based methods, such as off-policy evaluation (OPE). However, these methods fail to analyze special-case performance (e.g., worst or best cases) because they lack a holistic characterization of policies' performance. It is even more difficult to estimate precise policy values when the reward is not fully accessible under sparse settings. In this paper, we present Probabilistic Offline Policy Ranking (POPR), a framework that addresses OPR problems by leveraging expert data to characterize the probability of a candidate policy behaving like experts, and by approximating its entire performance posterior distribution to help with ranking. POPR does not rely on value estimation, and the derived performance posterior can be used to distinguish candidates in worst-, best-, and average-cases. To estimate the posterior, we propose POPR-EABC, an Energy-based Approximate Bayesian Computation (ABC) method that conducts likelihood-free inference. POPR-EABC reduces the heuristic nature of ABC through a smooth energy function and improves sampling efficiency through a pseudo-likelihood. We empirically demonstrate that POPR-EABC is adequate for evaluating policies in both discrete and continuous action spaces across various experimental environments, and that it facilitates probabilistic comparisons of candidate policies before deployment.
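The general idea in the abstract (a likelihood-free posterior over how often a candidate policy behaves like the expert, with a smooth energy weight in place of a hard ABC acceptance threshold) can be illustrated with a minimal sketch. This is not the authors' POPR-EABC algorithm: the uniform prior, the binomial agreement simulator, the `exp(-energy / temperature)` weighting, the function names, and the agreement counts are all assumptions chosen for illustration only.

```python
import math
import random


def abc_soft_posterior(agree, total, n_samples=5000, temperature=0.05, seed=0):
    """Return weighted samples (theta, weight) approximating p(theta | data).

    theta is the (assumed) probability that a candidate policy picks the
    same action as the expert on a state drawn from the expert data.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        theta = rng.random()  # assumed prior: Uniform(0, 1)
        # Simulate how many of `total` states would match the expert.
        sim_agree = sum(rng.random() < theta for _ in range(total))
        # Energy: normalized distance between simulated and observed counts.
        energy = abs(sim_agree - agree) / total
        # Smooth weight instead of a hard accept/reject threshold.
        weight = math.exp(-energy / temperature)
        samples.append((theta, weight))
    return samples


def posterior_mean(samples):
    """Weighted mean of theta (average-case summary)."""
    total_w = sum(w for _, w in samples)
    return sum(t * w for t, w in samples) / total_w


def weighted_quantile(samples, q):
    """Weighted quantile of theta (low q for worst case, high q for best case)."""
    ordered = sorted(samples)
    total_w = sum(w for _, w in ordered)
    cum = 0.0
    for theta, w in ordered:
        cum += w
        if cum >= q * total_w:
            return theta
    return ordered[-1][0]


# Hypothetical agreement counts: (matches with expert, expert states checked).
candidates = {"policy_a": (45, 50), "policy_b": (30, 50), "policy_c": (10, 50)}
posteriors = {name: abc_soft_posterior(k, n) for name, (k, n) in candidates.items()}

# Rank by posterior mean; a low quantile would give a worst-case ranking instead.
ranking = sorted(posteriors, key=lambda name: posterior_mean(posteriors[name]), reverse=True)
print(ranking)
```

Because each candidate gets a full set of weighted samples rather than a single value estimate, the same output supports average-case ranking (`posterior_mean`) and worst- or best-case comparisons (`weighted_quantile` at a low or high `q`).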
Pages: 20370-20378
Number of pages: 9