MultiCAD: Contrastive Representation Learning for Multi-modal 3D Computer-Aided Design Models

Cited by: 5
Authors
Ma, Weijian [1 ]
Xu, Minyang [1 ]
Li, Xueyang [1 ]
Zhou, Xiangdong [1 ]
Affiliations
[1] Fudan University, School of Computer Science, Shanghai, People's Republic of China
Keywords
Multimodal Machine Learning; Representation Learning; Contrastive Learning; Computer Aided Design
DOI
10.1145/3583780.3614982
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
CAD models are inherently multimodal: the information contained in construction sequences and in shapes is complementary, and representation learning methods should account for both. Previous methods, which learn unimodal representations, have neglected this trait. To leverage information from both modalities, we develop a multimodal contrastive learning strategy in which features from different modalities interact through a contrastive learning paradigm driven by a novel multimodal contrastive loss. Two pretext tasks, one on the geometry domain and one on the sequence domain, are designed together with a two-stage training strategy so that the representation focuses on encoding geometric details and decoding representations into construction sequences, making it more applicable to downstream tasks such as multimodal retrieval and CAD sequence reconstruction. Experimental results show that our multimodal representation learning scheme significantly outperforms the baselines and unimodal methods.
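For intuition only, the cross-modal contrastive objective described in the abstract pairs geometry features with construction-sequence features of the same CAD model. The following is a minimal sketch, assuming a symmetric InfoNCE-style loss between the two modalities; the function name, the arguments geo_emb and seq_emb, and the temperature value are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(geo_emb, seq_emb, temperature=0.07):
    # geo_emb, seq_emb: (batch, dim) embeddings of the geometry and the
    # construction-sequence views of the same batch of CAD models,
    # aligned by index (row i of each tensor comes from the same model).
    # NOTE: hypothetical sketch of a symmetric InfoNCE loss, not the
    # authors' released code.
    geo = F.normalize(geo_emb, dim=-1)  # unit-norm so dot product = cosine similarity
    seq = F.normalize(seq_emb, dim=-1)

    # Pairwise similarity matrix; diagonal entries are the positive pairs.
    logits = geo @ seq.t() / temperature
    targets = torch.arange(geo.size(0), device=geo.device)

    # Symmetric InfoNCE over both retrieval directions
    # (geometry-to-sequence and sequence-to-geometry).
    loss_g2s = F.cross_entropy(logits, targets)
    loss_s2g = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_g2s + loss_s2g)

In such a setup, minimizing the loss pulls the geometry and sequence embeddings of the same model together while pushing apart embeddings of different models in the batch, which is what enables downstream multimodal retrieval.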
Pages: 1766 - 1776
Number of pages: 11
Related Papers
50 records in total (records [21]-[30] shown below)
  • [21] Understanding self-directed learning behaviors in a computer-aided 3D design context. Liu, Bowen; Gui, Wendong; Gao, Tiantian; Wu, Yonghe; Zuo, Mingzhang. COMPUTERS & EDUCATION, 2023, 205.
  • [22] A scene representation based on multi-modal 2D and 3D features. Baseski, Emre; Pugeault, Nicolas; Kalkan, Sinan; Kraft, Dirk; Woergoetter, Florentin; Krueger, Norbert. 2007 IEEE 11TH INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOLS 1-6, 2007: 63 - +.
  • [23] COMPUTER-AIDED 3D SEISMIC INTERPRETATION. LEMCKE, K. OIL & GAS JOURNAL, 1982, 80 (42): 143 - 146.
  • [24] PromptLearner-CLIP: Contrastive Multi-Modal Action Representation Learning with Context Optimization. Zheng, Zhenxing; An, Gaoyun; Cao, Shan; Yang, Zhaoqilin; Ruan, Qiuqi. COMPUTER VISION - ACCV 2022, PT IV, 2023, 13844: 554 - 570.
  • [25] FMCS: Improving Code Search by Multi-Modal Representation Fusion and Momentum Contrastive Learning. Liu, Wenjie; Chen, Gong; Xie, Xiaoyuan. 2024 IEEE 24TH INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY, QRS, 2024: 632 - 638.
  • [26] MIXCON3D: SYNERGIZING MULTI-VIEW AND CROSS-MODAL CONTRASTIVE LEARNING FOR ENHANCING 3D REPRESENTATION. Gao, Yipeng; Wang, Zeyu; Zheng, Wei-Shi; Xie, Cihang; Zhou, Yuyin. arXiv, 2023.
  • [27] Learning Similarity Measure for Multi-Modal 3D Image Registration. Lee, Daewon; Hofmann, Matthias; Steinke, Florian; Altun, Yasemin; Cahill, Nathan D.; Schoelkopf, Bernhard. CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4, 2009: 186 - +.
  • [28] TAMM: TriAdapter Multi-Modal Learning for 3D Shape Understanding. Zhang, Zhihao; Cao, Shengcao; Wang, Yu-Xiong. 2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024: 21413 - 21423.
  • [29] BOOSTED METRIC LEARNING FOR 3D MULTI-MODAL DEFORMABLE REGISTRATION. Michel, Fabrice; Bronstein, Michael; Bronstein, Alex; Paragios, Nikos. 2011 8TH IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING: FROM NANO TO MACRO, 2011: 1209 - 1214.
  • [30] MoPA: Multi-Modal Prior Aided Domain Adaptation for 3D Semantic Segmentation. Cao, Haozhi; Xu, Yuecong; Yang, Jianfei; Yin, Pengyu; Yuan, Shenghai; Xie, Lihua. 2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2024), 2024: 9463 - 9470.