DARL: Distributed Reconfigurable Accelerator for Hyperdimensional Reinforcement Learning

Cited by: 13
Authors
Chen, Hanning [1 ]
Issa, Mariam [1 ]
Ni, Yang [1 ]
Imani, Mohsen [1 ]
Affiliations
[1] Univ Calif Irvine, Irvine, CA 92717 USA
Funding
National Science Foundation (USA);
Keywords
DOI
10.1145/3508352.3549437
CLC number
TP301 [Theory and Methods];
Discipline code
081202;
Abstract
Reinforcement Learning (RL) is a powerful technology for solving decision-making problems such as robotics control. Modern RL algorithms, e.g., Deep Q-Learning, are based on costly and resource-hungry deep neural networks. This motivates us to deploy alternative models for powering RL agents on edge devices. Recently, brain-inspired HyperDimensional Computing (HDC) has been introduced as a promising solution for lightweight and efficient machine learning, particularly for classification. In this work, we develop a novel platform capable of real-time hyperdimensional reinforcement learning. Our heterogeneous CPU-FPGA platform, called DARL, maximizes the FPGA's computing capabilities by applying hardware optimizations to hyperdimensional computing's critical operations, including a hardware-friendly encoder IP, hypervector chunk fragmentation, and delayed model update. Aside from hardware innovation, we also extend the platform from basic single-agent RL to support multi-agent distributed learning. We evaluate the effectiveness of our approach on OpenAI Gym tasks. Our results show that the FPGA platform provides on average 20x speedup compared to current state-of-the-art hyperdimensional RL methods running on an Intel Xeon 6226 CPU. In addition, DARL provides around 4.8x speedup and 4.2x higher energy efficiency compared to the state-of-the-art RL accelerator while ensuring a better or comparable quality of learning.
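The abstract's core idea (replacing a deep Q-network with hyperdimensional computing) can be illustrated with a minimal sketch: states are encoded into high-dimensional bipolar hypervectors, and Q-values are computed as similarity between the encoded state and one model hypervector per action. This is an illustrative reconstruction, not the paper's actual algorithm or API; the random-projection encoder, dimensionality `D`, and function names are assumptions.

```python
import numpy as np

D = 2048           # hypervector dimensionality (illustrative choice)
N_ACTIONS = 2
STATE_DIM = 4      # e.g. a CartPole-sized observation

rng = np.random.default_rng(0)
# Fixed random projection basis used by the encoder.
proj = rng.standard_normal((D, STATE_DIM))

def encode(state):
    """Map a real-valued state to a bipolar hypervector via sign of a random projection."""
    return np.sign(proj @ np.asarray(state, dtype=float))

# One model hypervector per action; Q(s, a) ~ similarity(encode(s), model[a]).
model = np.zeros((N_ACTIONS, D))

def q_values(state):
    """Normalized dot-product similarity against each action's model hypervector."""
    return model @ encode(state) / D

def update(state, action, td_error, lr=0.1):
    """Bundle the scaled encoded state into the chosen action's model hypervector."""
    model[action] += lr * td_error * encode(state)
```

Because the encoder is a fixed projection plus element-wise sign, and the model update is a vector accumulation, both map naturally onto parallel FPGA logic, which is consistent with the hardware-friendly operations the abstract highlights.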
Pages: 9
Related papers
50 items total
  • [1] DARL1N: Distributed multi-Agent Reinforcement Learning with One-hop Neighbors
    Wang, Baoqian
    Xie, Junfei
    Atanasov, Nikolay
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 9003 - 9010
  • [2] DARL: Distance-Aware Uncertainty Estimation for Offline Reinforcement Learning
    Zhang, Hongchang
    Shao, Jianzhun
    He, Shuncheng
    Jiang, Yuhang
    Ji, Xiangyang
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 9, 2023, : 11210 - 11218
  • [3] Efficient Exploration in Edge-Friendly Hyperdimensional Reinforcement Learning
    Ni, Yang
    Chung, William Youngwoo
    Cho, Samuel
    Zou, Zhuowen
    Imani, Mohsen
    PROCEEDING OF THE GREAT LAKES SYMPOSIUM ON VLSI 2024, GLSVLSI 2024, 2024, : 111 - 118
  • [4] DISTRIBUTED REINFORCEMENT LEARNING
    WEISS, G
    ROBOTICS AND AUTONOMOUS SYSTEMS, 1995, 15 (1-2) : 135 - 142
  • [6] HDPG: Hyperdimensional Policy-based Reinforcement Learning for Continuous Control
    Ni, Yang
    Issa, Mariam
    Abraham, Danny
    Imani, Mahdi
    Yin, Xunzhao
    Imani, Mohsen
    PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022, 2022, : 1141 - 1146
  • [7] FRL: Fast and Reconfigurable Accelerator for Distributed Sound Source Localization
    Ding, Xiaofeng
    Wang, Chengliang
    Liu, Heping
    Zhang, Zhihai
    Chen, Xianzhang
    Tan, Yujuan
    Liu, Duo
    Ren, Ao
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41 (11) : 3922 - 3933
  • [8] Distributed Offline Reinforcement Learning
    Heredia, Paulo
    George, Jemin
    Mou, Shaoshuai
    2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 4621 - 4626
  • [9] A General Purpose Hyperdimensional Computing Accelerator for Edge Computing
    Asghari, Mohsen
    Le Beux, Sebastien
    2024 22ND IEEE INTERREGIONAL NEWCAS CONFERENCE, NEWCAS 2024, 2024, : 383 - 387
  • [10] Hardware Accelerator for Capsule Network based Reinforcement Learning
    Ram, Dola
    Panwar, Suraj
    Varghese, Kuruvilla
    2022 35TH INTERNATIONAL CONFERENCE ON VLSI DESIGN (VLSID 2022) HELD CONCURRENTLY WITH 2022 21ST INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS (ES 2022), 2022, : 162 - 167