A Smart Dual-modal Aligned Transformer Deep Network for Robotic Grasp Detection

Cited by: 0

Authors:

Cang, Xin [1]

Zhang, Haojun [1]

Yang, Yuequan [1]

Cao, Zhiqiang [2]

Li, Fudong [1]

Zhu, Jiaming [1]

Affiliations:

[1] Yangzhou University, School of Information Engineering, Yangzhou, Jiangsu, People's Republic of China

[2] Chinese Academy of Sciences, Institute of Automation, Beijing, People's Republic of China

Source:

2024 14th Asian Control Conference, ASCC 2024 | 2024

Funding:

National Natural Science Foundation of China

Keywords:

Dual modalities; Feature alignment; Robotic grasping; Transformer;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Robotic grasping is a crucial visual task for both service robots and industrial robots. Most existing deep vision learning approaches to robotic grasping either treat RGB-D as a single modality or use the two modalities indiscriminately, which often overlooks the valuable depth information in RGB-D images. To address this limitation, this paper proposes a smart dual-modal aligned transformer deep network (SATNet), which is both very lightweight and high-performing on robotic grasping tasks using RGB-D images. Specifically, a novel ATFormer module with two parallel aligned transformer encoder blocks is designed to fuse global feature maps efficiently. Experiments on the Cornell dataset show that the proposed model outperforms existing methods: it has a lightweight framework with only 0.27M parameters, achieves an accuracy of 97.8%, and has an inference time of 16.3 ms.


Pages: 1230-1235

Page count: 6
