Counterfactual Data-Fusion for Online Reinforcement Learners

Citations: 0
Authors
Forney, Andrew [1]
Pearl, Judea [1]
Bareinboim, Elias [2]
Affiliations
[1] Univ Calif Los Angeles, Los Angeles, CA 90095 USA
[2] Purdue Univ, W Lafayette, IN 47907 USA
Funding
National Science Foundation (USA)
Keywords
MULTIARMED BANDIT;
DOI
Not available
Chinese Library Classification
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Multi-Armed Bandit problem with Unobserved Confounders (MABUC) considers decision-making settings where unmeasured variables can influence both the agent's decisions and received rewards (Bareinboim et al., 2015). Recent findings showed that unobserved confounders (UCs) pose a unique challenge to algorithms based on standard randomization (i.e., experimental data); if UCs are naively averaged out, these algorithms behave sub-optimally, possibly incurring infinite regret. In this paper, we show how counterfactual-based decision-making circumvents these problems and leads to a coherent fusion of observational and experimental data. We then demonstrate this new strategy in an enhanced Thompson Sampling bandit player, and support our findings' efficacy with extensive simulations.
Pages: 9