Visual Sorting Method Based on Multi-Modal Information Fusion

Cited by: 2
Authors
Han, Song [1]
Liu, Xiaoping [1]
Wang, Gang [1]
Affiliation
[1] Beijing Univ Posts & Telecommun, Sch Modern Post, Beijing 100876, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2022, Vol. 12, No. 6
Keywords
multi-modal; self-attention; Swin Transformer; depth estimation; robot sorting
DOI
10.3390/app12062946
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Visual sorting of stacked parcels is a key issue in intelligent logistics sorting systems. To improve the sorting success rate of express parcels and obtain their sorting order effectively, this paper proposes a visual sorting method based on multi-modal information fusion (VS-MF). First, an object detection network based on multi-modal information fusion (OD-MF) is proposed. A global gradient feature extracted from depth information serves as a self-attention module, so the network learns more spatial features and detection accuracy improves significantly. Second, a multi-modal segmentation network based on Swin Transformer (MS-ST) is proposed to detect the optimal sorting positions and poses of parcels. Adding Swin Transformer models yields more fine-grained information about the parcels to be sorted and the relationships between them. Frequency-domain information and depth information are used as supervision signals to obtain the pickable areas and infer the occlusion degrees of parcels. A strategy for the optimal sorting order is also proposed to ensure the stability of the system. Finally, a sorting system with a 6-DOF robot is constructed to complete the sorting task of stacked parcels. The accuracy and stability of the system are verified by sorting experiments.
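The abstract mentions a sorting-order strategy driven by pickable areas and occlusion degrees but does not spell out the rule. The sketch below is only an illustration of how such a ranking could look, not the authors' method: the `Parcel` fields, the value ranges, and the rule "least occluded first, larger pickable area breaks ties" are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parcel:
    # All fields are hypothetical; the abstract does not define them.
    parcel_id: str
    occlusion: float      # assumed scale: 0.0 = fully exposed, 1.0 = fully covered
    pickable_area: float  # assumed unit: area of the pickable region in cm^2


def sorting_order(parcels):
    """Return parcel IDs, least occluded first; larger pickable area breaks ties."""
    ranked = sorted(parcels, key=lambda p: (p.occlusion, -p.pickable_area))
    return [p.parcel_id for p in ranked]


# Illustrative values only, not data from the paper.
parcels = [
    Parcel("A", occlusion=0.40, pickable_area=120.0),
    Parcel("B", occlusion=0.05, pickable_area=80.0),
    Parcel("C", occlusion=0.05, pickable_area=150.0),
]
order = sorting_order(parcels)
print(order)
```

Picking the least occluded parcel first is one plausible way to avoid disturbing the stack, which is in the spirit of the stability goal the abstract states; the actual criterion used in VS-MF may differ.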
Pages: 16