50 items in total
- [2] VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021
- [3] Self-Supervised Learning by Cross-Modal Audio-Video Clustering [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
- [4] Cascaded Siamese Self-supervised Audio to Video GAN [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022 : 4690 - 4699
- [5] Self-Supervised Generation of Spatial Audio for 360° Video [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
- [6] Self-supervised learning of class embeddings from video [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019 : 3019 - 3027
- [7] The Efficacy of Self-Supervised Speech Models as Audio Representations [J]. HEAR: HOLISTIC EVALUATION OF AUDIO REPRESENTATIONS, VOL 166, 2021, 166 : 90 - 110
- [9] WaveBYOL: Self-Supervised Learning for Audio Representation From Raw Waveforms [J]. IEEE ACCESS, 2023, 11 : 8968 - 8977
- [10] Self-supervised Learning for Endoscopic Video Analysis [J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT V, 2023, 14224 : 569 - 578