50 items in total
- [1] Self-Supervised Learning by Cross-Modal Audio-Video Clustering [J]. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020, 33
- [4] Cross-modal self-supervised representation learning for gesture and skill recognition in robotic surgery [J]. International Journal of Computer Assisted Radiology and Surgery, 2021, 16 : 779 - 787
- [5] Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity [J]. Thirty-Seventh AAAI Conference on Artificial Intelligence, Vol 37 No 8, 2023 : 9723 - 9732
- [7] Self-Supervised Learning with Cross-Modal Transformers for Emotion Recognition [J]. 2021 IEEE Spoken Language Technology Workshop (SLT), 2021 : 381 - 388
- [8] Cross-Architecture Self-supervised Video Representation Learning [J]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), 2022 : 19248 - 19257
- [9] CMD: Self-supervised 3D Action Representation Learning with Cross-Modal Mutual Distillation [J]. Computer Vision - ECCV 2022, Pt III, 2022, 13663 : 734 - 752