HTC-Grasp: A Hybrid Transformer-CNN Architecture for Robotic Grasp Detection

Cited by: 4
Authors
Zhang, Qiang [1]
Zhu, Jianwei [1]
Sun, Xueying [1]
Liu, Mingmin [2]
Affiliations
[1] Jiangsu Univ Sci & Technol, Sch Automat, 666 Changhui Rd, Zhenjiang 212100, Peoples R China
[2] SIASUN Robot & Automat Co Ltd, Cent Res Inst, 16 Jinhui St, Shenyang 110168, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
robotic grasp; transformer; attentional mechanism
DOI
10.3390/electronics12061505
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Accurately detecting suitable grasp areas for unknown objects from visual information remains a challenging task. Drawing inspiration from the success of the Vision Transformer in visual detection, a hybrid Transformer-CNN architecture for robotic grasp detection, HTC-Grasp, is developed to improve the accuracy of grasping unknown objects. The architecture employs an external attention-based hierarchical Transformer as the encoder to capture global context and correlation features across the entire dataset. A channel-wise attention-based CNN decoder adaptively adjusts the channel weights, resulting in more efficient feature aggregation. The proposed method is validated on the Cornell and Jacquard datasets, achieving image-wise detection accuracies of 98.3% and 95.8%, respectively, and object-wise detection accuracies of 96.9% and 92.4% on the same datasets. A physical experiment with an Elite 6-DoF robot achieves a grasping accuracy of 93.3%, demonstrating the method's ability to grasp unknown objects in real scenarios. These results indicate that the proposed method outperforms other state-of-the-art methods.
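The abstract does not give implementation details for the encoder, so the following is only a minimal sketch of the general external-attention idea it refers to: tokens attend to two small learnable memories that are shared across all inputs, which is how correlations across the whole dataset can be captured. The function name, argument names, shapes, and the double-normalization step are assumptions taken from the commonly used external-attention formulation, not from HTC-Grasp itself.

```python
import numpy as np


def external_attention(features, mem_k, mem_v, eps=1e-9):
    """Illustrative external attention (hypothetical helper, not the paper's code).

    features: (N, d) array of N token features from one image.
    mem_k, mem_v: (S, d) arrays, memories shared by every input.
    Returns an (N, d) array of refined token features.
    """
    # Similarity between every token and every memory slot: (N, S).
    attn = features @ mem_k.T
    # Softmax over the token axis, with max-subtraction for numerical stability.
    attn = np.exp(attn - attn.max(axis=0, keepdims=True))
    attn = attn / attn.sum(axis=0, keepdims=True)
    # L1-normalise over the memory axis (second half of the double normalisation).
    attn = attn / (attn.sum(axis=1, keepdims=True) + eps)
    # Each output token is a weighted mix of the value-memory rows.
    return attn @ mem_v


# Usage with random stand-in data (assumed sizes: 6 tokens, 4 channels, 3 slots).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 4))
memory_k = rng.normal(size=(3, 4))
memory_v = rng.normal(size=(3, 4))
refined = external_attention(tokens, memory_k, memory_v)
print(refined.shape)
```

Because the memories do not depend on the input, the cost grows linearly with the number of tokens rather than quadratically as in self-attention, which is the usual motivation for this design.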
Pages: 16