Towards real-time embodied AI agent: a bionic visual encoding framework for mobile robotics

Cited by: 0
Authors
Hou, Xueyu [1]
Guan, Yongjie [1]
Han, Tao [2]
Wang, Cong [2]
Affiliations
[1] Univ Maine, ECE Dept, Orono, ME 04469 USA
[2] New Jersey Inst Technol, ECE Dept, Newark, NJ USA
Keywords
Mobile robotics; Visual encoding; Embodied AI; Computer vision; Iconic memory
DOI
10.1007/s41315-024-00363-w
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
Embodied artificial intelligence (AI) agents, which navigate and interact with their environment using sensors and actuators, are being deployed on mobile robotic platforms with limited computing power, such as autonomous vehicles, drones, and humanoid robots. These systems make decisions based on environmental perception from deep neural network (DNN)-based visual encoders. However, the constrained computational resources and the large amounts of visual data to be processed can create bottlenecks, such as taking almost 300 milliseconds per decision on an embedded GPU board (Jetson Xavier). Existing DNN acceleration methods need model retraining and can still reduce accuracy. To address these challenges, our paper introduces a bionic visual encoder framework, Robye, to support the real-time requirements of embodied AI agents. The proposed framework complements existing DNN acceleration techniques. Specifically, we integrate motion data to identify overlapping areas between consecutive frames, which reduces DNN workload by propagating encoding results. We bifurcate processing into high-resolution for task-critical areas and low-resolution for less-significant regions. This dual-resolution approach allows us to maintain task performance while lowering the overall computational demands. We evaluate Robye across three robotic scenarios: autonomous driving, vision-and-language navigation, and drone navigation, using various DNN models and mobile platforms. Robye outperforms baselines in speed (1.2-3.3×), performance (+4% to +29%), and power consumption (-36% to -47%).
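The idea of propagating encoding results across overlapping frame regions can be illustrated with a minimal sketch. This is not the authors' implementation: the tile grid, the `toy_encoder` stand-in for a DNN, the whole-tile horizontal shift `shift_cols`, and all function names are assumptions made purely for illustration. The sketch covers only the overlap-reuse step, not the dual-resolution processing.

```python
from typing import Callable, List, Optional, Tuple

Tile = List[float]   # hypothetical: a tile is a flat list of pixel values
Encoding = float     # hypothetical: an encoding is a single number


def toy_encoder(tile: Tile) -> Encoding:
    """Stand-in for an expensive DNN encoder: here just the mean pixel value."""
    return sum(tile) / len(tile)


def encode_with_propagation(
    tiles: List[List[Tile]],
    prev_enc: Optional[List[List[Encoding]]],
    shift_cols: int,
    encoder: Callable[[Tile], Encoding] = toy_encoder,
) -> Tuple[List[List[Encoding]], int]:
    """Encode a grid of tiles, reusing encodings from the previous frame.

    shift_cols is the assumed camera motion in whole tiles: the content at
    column c of the current frame was at column c + shift_cols previously.
    Returns the encoding grid and the number of encoder calls made.
    """
    rows, cols = len(tiles), len(tiles[0])
    calls = 0
    out: List[List[Encoding]] = []
    for r in range(rows):
        row_enc: List[Encoding] = []
        for c in range(cols):
            src = c + shift_cols
            if prev_enc is not None and 0 <= src < cols:
                # Overlapping area: copy the cached encoding, skip the encoder.
                row_enc.append(prev_enc[r][src])
            else:
                # Newly visible area: run the encoder.
                row_enc.append(encoder(tiles[r][c]))
                calls += 1
        out.append(row_enc)
    return out, calls


def make_frame(offset: int, rows: int = 2, cols: int = 4) -> List[List[Tile]]:
    """Synthetic scene in which a tile's value depends only on its world column."""
    return [[[float(c + offset)] * 4 for c in range(cols)] for _ in range(rows)]


frame0 = make_frame(0)
frame1 = make_frame(1)  # camera moved right by exactly one tile
enc0, calls0 = encode_with_propagation(frame0, None, 0)
enc1, calls1 = encode_with_propagation(frame1, enc0, 1)
full1, _ = encode_with_propagation(frame1, None, 0)  # full re-encode, for comparison
```

In this toy setting the second frame needs encoder calls only for the newly visible column, and the propagated grid matches a full re-encode. Real motion is not tile-aligned, so a practical system would need warping or interpolation of the cached encodings.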
Pages: 1038-1056
Page count: 19