Diffusion policy: Visuomotor policy learning via action diffusion

Cited by: 8
Authors
Chi, Cheng [1]
Xu, Zhenjia [1]
Feng, Siyuan [2]
Cousineau, Eric [2]
Du, Yilun [3]
Burchfiel, Benjamin [2]
Tedrake, Russ [2, 3]
Song, Shuran [1, 4]
Affiliations
[1] Columbia Univ, Comp Sci, New York, NY USA
[2] Toyota Res Inst, Palo Alto, CA USA
[3] MIT, EECS, Cambridge, MA USA
[4] Stanford Univ, Elect Engn, Stanford, CA USA
Keywords
Imitation learning; visuomotor policy; manipulation
DOI
10.1177/02783649241273668
Chinese Library Classification (CLC) number
TP24 [Robotics]
Subject classification code
080202; 1405
Abstract
This paper introduces Diffusion Policy, a new way of generating robot behavior by representing a robot's visuomotor policy as a conditional denoising diffusion process. We benchmark Diffusion Policy across 15 different tasks from 4 different robot manipulation benchmarks and find that it consistently outperforms existing state-of-the-art robot learning methods with an average improvement of 46.9%. Diffusion Policy learns the gradient of the action-distribution score function and iteratively optimizes with respect to this gradient field during inference via a series of stochastic Langevin dynamics steps. We find that the diffusion formulation yields powerful advantages when used for robot policies, including gracefully handling multimodal action distributions, being suitable for high-dimensional action spaces, and exhibiting impressive training stability. To fully unlock the potential of diffusion models for visuomotor policy learning on physical robots, this paper presents a set of key technical contributions including the incorporation of receding horizon control, visual conditioning, and the time-series diffusion transformer. We hope this work will help motivate a new generation of policy learning techniques that are able to leverage the powerful generative modeling capabilities of diffusion models. Code, data, and training details are available (diffusion-policy.cs.columbia.edu).
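The Langevin-dynamics sampling step mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a one-dimensional action, replaces the learned, observation-conditioned network with an analytic score for a hypothetical two-mode Gaussian mixture, and uses a fixed step size rather than a diffusion noise schedule. The names `MODES`, `SIGMA`, `score`, `langevin_sample`, and `sample_actions` are illustrative only.

```python
import math
import random

MODES = (-2.0, 2.0)  # two equally valid "actions" (toy assumption)
SIGMA = 0.3          # spread of each mode (toy assumption)


def score(a):
    """Gradient of log p(a) for an equal-weight mixture of two Gaussians.

    In Diffusion Policy this quantity is learned by a network conditioned on
    observations; here it is written analytically to keep the sketch small.
    """
    logits = [-((a - m) ** 2) / (2 * SIGMA ** 2) for m in MODES]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return sum(w / total * (m - a) / SIGMA ** 2 for w, m in zip(weights, MODES))


def langevin_sample(rng, steps=200, step_size=0.01):
    """Run stochastic Langevin dynamics starting from broad Gaussian noise."""
    a = rng.gauss(0.0, 3.0)
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        # Move along the score (gradient) field, then inject noise.
        a = a + 0.5 * step_size * score(a) + math.sqrt(step_size) * noise
    return a


def sample_actions(n=500, seed=0):
    """Draw n independent toy actions with a seeded random generator."""
    rng = random.Random(seed)
    return [langevin_sample(rng) for _ in range(n)]
```

Calling `sample_actions()` should return values clustered around both -2 and 2 rather than around their average of 0, which is the multimodal behavior the abstract refers to: each sample commits to one mode instead of blending them.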
Pages: 21